COMPARATIVE GENE TRANSCRIPT ANALYSIS



1. FIELD OF INVENTION
The present invention is in the field of molecular 
biology and computer science; more particularly, the 
present invention describes methods of analyzing gene 
transcripts and diagnosing the genetic expression of cells 
and tissue. 



2. BACKGROUND OF THE INVENTION

Until very recently, the history of molecular biology
has been written one gene at a time. Scientists have
observed the cell's physical changes, isolated mixtures
from the cell or its milieu, purified proteins, sequenced
proteins and therefrom constructed probes to look for the
corresponding gene.

Recently, different nations have set up massive
projects to sequence the billions of bases in the human
genome. These projects typically begin with dividing the
genome into large portions of chromosomes and then
determining the sequences of these pieces, which are then
analyzed for identity with known proteins or portions
thereof, known as motifs. Unfortunately, the majority of
genomic DNA does not encode proteins and though it is
postulated to have some effect on the cell's ability to
make protein, its relevance to medical applications is not
understood at this time.

A third methodology involves sequencing only the
transcripts encoding the cellular machinery actively
involved in making protein, namely the mRNA. The advantage
is that the cell has already edited out all the non-coding
DNA, and it is relatively easy to identify the
protein-coding portion of the RNA. The utility of this
approach was not immediately obvious to genomic researchers.
In fact, when cDNA sequencing was initially proposed, the
method was roundly denounced by those committed to genomic
sequencing. For example, the head of the U.S. Human Genome
project discounted cDNA sequencing as not valuable and
refused to approve funding of projects.

In this disclosure, we teach methods for analyzing
DNA, including cDNA libraries. Based on our analyses and
research, we see each individual gene product as a "pixel"
of information, which relates to the expression of that,
and only that, gene. We teach herein methods whereby the
individual "pixels" of gene expression information can be
combined into a single gene transcript "image," in which
each of the individual genes can be visualized
simultaneously, allowing relationships between the gene
pixels to be easily visualized and understood.

We further teach a new method which we call electronic
subtraction. Electronic subtraction will enable the gene
researcher to turn a single image into a moving picture,
one which describes the temporality or dynamics of gene
expression, at the level of a cell or a whole tissue. It
is that sense of "motion" of cellular machinery on the
scale of a cell or organ which constitutes the new
invention herein. This constitutes a new view into the
process of living cell physiology and one which holds great
promise to unveil and discover new therapeutic and
diagnostic approaches in medicine.

We teach another method which we call "electronic
northern," which tracks the expression of a single gene
across many types of cells and tissues.
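
The "electronic northern" idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name, the library names and the gene labels are hypothetical, and each library is assumed to be available as a simple list with one gene identification per sequenced clone.

```python
from collections import Counter

def electronic_northern(gene, libraries):
    """Return {library name: fraction of clones identified as `gene`}.

    `libraries` maps a library name to a list holding one gene
    identification per sequenced clone (hypothetical layout).
    """
    profile = {}
    for name, clone_ids in libraries.items():
        counts = Counter(clone_ids)
        total = len(clone_ids)
        # Normalize by library size so libraries of different depth compare.
        profile[name] = counts[gene] / total if total else 0.0
    return profile

# Made-up example data, not taken from any real library.
libraries = {
    "HUVEC": ["actin", "actin", "EF-1a", "IL-8"],
    "monocyte": ["actin", "IL-8", "IL-8", "IL-8", "TNF"],
}
profile = electronic_northern("IL-8", libraries)
```

Dividing by the number of clones in each library is a design choice made here so that libraries sequenced to different depths can be compared; raw counts could be reported instead.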

Nucleic acids (DNA and RNA) carry within their
sequence the hereditary information and are therefore the
prime molecules of life. Nucleic acids are found in all
living organisms including bacteria, fungi, viruses, plants
and animals. It is of interest to determine the relative
abundance of different discrete nucleic acids in different
cells, tissues and organisms over time under various
conditions, treatments and regimes.

All dividing cells in the human body contain the same
set of 23 pairs of chromosomes. It is estimated that these
autosomal and sex chromosomes encode approximately 100,000
genes. The differences among different types of cells are
believed to reflect the differential expression of the
100,000 or so genes. Fundamental questions of biology
could be answered by understanding which genes are
transcribed and knowing the relative abundance of
transcripts in different cells.

Previously, the art has only provided for the analysis
of a few known genes at a time by standard molecular
biology techniques such as PCR, northern blot analysis, or
other types of DNA probe analysis such as in situ
hybridization. Each of these methods allows one to analyze
the transcription of only known genes and/or small numbers
of genes at a time. Nucl. Acids Res. 19, 7097-7104 (1991);
Nucl. Acids Res. 18, 4833-42 (1990); Nucl. Acids Res. 18,
2789-92 (1989); European J. Neuroscience 2, 1063-1073
(1990); Analytical Biochem. 187, 364-73 (1990); Genet.
Annals Techn. Appl. 7, 64-70 (1990); GATA 8(4), 129-33
(1991); Proc. Natl. Acad. Sci. USA 85, 1696-1700 (1988);
Nucl. Acids Res. 19, 1954 (1991); Proc. Natl. Acad. Sci.
USA 88, 1943-47 (1991); Nucl. Acids Res. 19, 6123-27
(1991); Proc. Natl. Acad. Sci. USA 85, 5738-42 (1988);
Nucl. Acids Res. 16, 10937 (1988).

Studies of the number and types of genes whose
transcription is induced or otherwise regulated during cell
processes such as activation, differentiation, aging, viral
transformation, morphogenesis, and mitosis have been
pursued for many years, using a variety of methodologies.
One of the earliest methods was to isolate and analyze
levels of the proteins in a cell, tissue, organ system, or
even organisms both before and after the process of
interest. One method of analyzing multiple proteins in a
sample is using 2-dimensional gel electrophoresis, wherein
proteins can be, in principle, identified and quantified as
individual bands, and ultimately reduced to a discrete
signal. At present, 2-dimensional analysis only resolves
approximately 15% of the proteins. In order to positively
analyze those bands which are resolved, each band must be
excised from the membrane and subjected to protein sequence
analysis using Edman degradation. Unfortunately, most of
the bands were present in quantities too small to obtain a
reliable sequence, and many of those bands contained more
than one discrete protein. An additional difficulty is
that many of the proteins were blocked at the
amino-terminus, further complicating the sequencing
process.

Analyzing differentiation at the gene transcription
level has overcome many of these disadvantages and
drawbacks, since the power of recombinant DNA technology
allows amplification of signals containing very small
amounts of material. The most common method, called
"hybridization subtraction," involves isolation of mRNA
from the biological specimen before (B) and after (A) the
developmental process of interest, transcribing one set of
mRNA into cDNA, subtracting specimen B from specimen A
(mRNA from cDNA) by hybridization, and constructing a cDNA
library from the non-hybridizing mRNA fraction. Many
different groups have used this strategy successfully, and
a variety of procedures have been published and improved
upon using this same basic scheme. Nucl. Acids Res. 19,
7097-7104 (1991); Nucl. Acids Res. 18, 4833-42 (1990);
Nucl. Acids Res. 18, 2789-92 (1989); European J.
Neuroscience 2, 1063-1073 (1990); Analytical Biochem. 187,
364-73 (1990); Genet. Annals Techn. Appl. 7, 64-70 (1990);
GATA 8(4), 129-33 (1991); Proc. Natl. Acad. Sci. USA 85,
1696-1700 (1988); Nucl. Acids Res. 19, 1954 (1991); Proc.
Natl. Acad. Sci. USA 88, 1943-47 (1991); Nucl. Acids Res.
19, 6123-27 (1991); Proc. Natl. Acad. Sci. USA 85, 5738-42
(1988); Nucl. Acids Res. 16, 10937 (1988).

Although each of these techniques has particular
strengths and weaknesses, there are still some limitations
and undesirable aspects of these methods. First, the time
and effort required to construct such libraries is quite
large. Typically, a trained molecular biologist might
expect construction and characterization of such a library
to require 3 to 6 months, depending on the level of skill,
experience, and luck. Second, the resulting subtraction
libraries are typically inferior to the libraries
constructed by standard methodology. A typical
conventional cDNA library should have a clone complexity of
at least 10^6 clones, and an average insert size of 1-3 kB.
In contrast, subtracted libraries can have complexities of
10^2 or 10^3 and average insert sizes of 0.2 kB. Therefore,
there can be a significant loss of clone and sequence
information associated with such libraries. Third, this
approach allows the researcher to capture only the genes
induced in specimen A relative to specimen B, not
vice-versa, nor does it easily allow comparison to a third
specimen of interest (C). Fourth, this approach requires
very large amounts (hundreds of micrograms) of "driver"
mRNA (specimen B), which significantly limits the number
and type of subtractions that are possible since many
tissues and cells are very difficult to obtain in large
quantities.

Fifth, the resolution of the subtraction is dependent
upon the physical properties of DNA:DNA or RNA:DNA
hybridization. The ability of a given sequence to find a
hybridization match is dependent on its unique CoT value.
The CoT value is a function of the number of copies
(concentration) of the particular sequence, multiplied by
the time of hybridization. It follows that for sequences
which are abundant, hybridization events will occur very
rapidly (low CoT value), while rare sequences will form
duplexes at very high CoT values. CoT values which allow
such rare sequences to form duplexes and therefore be
effectively selected are difficult to achieve in a
convenient time frame. Therefore, hybridization
subtraction is simply not a useful technique with which to
study relative levels of rare mRNA species. Sixth, this
problem is further complicated by the fact that duplex
formation is also dependent on the nucleotide base
composition for a given sequence. Those sequences rich in
G + C form stronger duplexes than those with high contents
of A + T. Therefore, the former sequences will tend to be
removed selectively by hybridization subtraction. Seventh,
it is possible that hybridization between nonexact matches
can occur. When this happens, the expression of a
homologous gene may "mask" expression of a gene of
interest, artificially skewing the results for that
particular gene.
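
As a rough numerical illustration of the fifth limitation, the following Python sketch treats the CoT value as concentration multiplied by time, as stated above, and estimates how long one sequence species must hybridize to reach a given CoT. The function name, the target CoT and the concentrations are hypothetical values chosen only to show the scaling; they are not taken from any experiment.

```python
def hours_to_reach_cot(target_cot, total_conc, fraction):
    """Hours needed for one sequence species to reach `target_cot`.

    target_cot : desired CoT value (mol * s / L), illustrative only
    total_conc : total nucleotide concentration (mol / L), illustrative only
    fraction   : share of the nucleic acid pool made up by this sequence
    """
    if total_conc <= 0 or not 0 < fraction <= 1:
        raise ValueError("concentration and fraction must be positive")
    effective_conc = total_conc * fraction  # concentration of this species alone
    seconds = target_cot / effective_conc   # CoT = concentration x time
    return seconds / 3600.0

# Same target CoT, same pool, different abundance (assumed numbers).
abundant = hours_to_reach_cot(1e-3, 1e-3, 1e-2)
rare = hours_to_reach_cot(1e-3, 1e-3, 1e-5)
```

Under these assumptions a sequence that is a thousand times rarer needs roughly a thousand times longer to reach the same CoT, which is the practical obstacle described in the text.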

Matsubara and Okubo proposed using partial cDNA
sequences to establish expression profiles of genes which
could be used in functional analyses of the human genome.
Matsubara and Okubo warned against using random priming as
it creates multiple unique DNA fragments from individual
mRNAs and may thus skew the analysis of the number of
particular mRNAs per library. They sequenced randomly
selected members from a 3'-directed cDNA library and
established the frequency of appearance of the various
ESTs. They proposed comparing lists of ESTs from various
cell types to classify genes. Genes expressed in many
different cell types were labeled housekeepers and those
selectively expressed in certain cells were labeled
cell-specific genes, even in the absence of the full sequence
of the gene or the biological activity of the gene product.
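
The list-comparison approach attributed to Matsubara and Okubo can be illustrated with a short Python sketch. The thresholds, the "intermediate" label, the function name and the EST identifiers are all assumptions introduced here for illustration; the text above does not state how "many different cell types" was quantified.

```python
def classify_transcripts(est_lists, housekeeper_fraction=0.75):
    """Label each EST by how widely it appears across cell types.

    est_lists maps a cell type to the list of ESTs observed in it.
    housekeeper_fraction is an assumed cutoff, not a published value.
    """
    n_types = len(est_lists)
    seen_in = {}
    for cell_type, ests in est_lists.items():
        for est in set(ests):  # presence or absence, not copy number
            seen_in.setdefault(est, set()).add(cell_type)

    labels = {}
    for est, types in seen_in.items():
        if len(types) == 1 and n_types > 1:
            labels[est] = "cell-specific"
        elif len(types) / n_types >= housekeeper_fraction:
            labels[est] = "housekeeper"
        else:
            labels[est] = "intermediate"
    return labels

# Made-up EST lists for four cell types.
est_lists = {
    "liver": ["est1", "est2", "est2"],
    "brain": ["est1", "est3"],
    "muscle": ["est1", "est2"],
    "kidney": ["est1"],
}
labels = classify_transcripts(est_lists)
```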

The present invention avoids the drawbacks of the
prior art by providing a method to quantify the relative
abundance of multiple gene transcripts in a given
biological specimen by the use of high-throughput
sequence-specific analysis of individual RNAs and/or their
corresponding cDNAs.

The present invention offers several advantages over
current protein discovery methods which attempt to isolate
individual proteins based upon biological effects. The
method of the instant invention provides for detailed
diagnostic comparisons of cell profiles revealing numerous
changes in the expression of individual transcripts.

The instant invention provides several advantages over
current subtraction methods including a more complex
library analysis (10^6 to 10^7 clones as compared to 10^3
clones) which allows identification of low abundance
messages as well as enabling the identification of messages
which either increase or decrease in abundance. These
large libraries are very routine to make in contrast to the
libraries of previous methods. In addition, homologues can
easily be distinguished with the method of the instant
invention.

This method is very convenient because it organizes a
large quantity of data into a comprehensible, digestible
format. The most significant differences are highlighted
by electronic subtraction. In-depth analyses are made more
convenient.

The present invention provides several advantages over
previous methods of electronic analysis of cDNA. The
method is particularly powerful when more than 100 and
preferably more than 1,000 gene transcripts are analyzed.
In such a case, new low-frequency transcripts are
discovered and tissue typed.

High resolution analysis of gene expression can be 
used directly as a diagnostic profile or to identify 
disease-specific genes for the development of more classic 
diagnostic approaches.

This process is defined as gene transcript frequency 
analysis. The resulting quantitative analysis of the gene 
transcripts is defined as comparative gene transcript 
analysis. 

3. SUMMARY OF THE INVENTION

The invention is a method of analyzing a specimen
containing gene transcripts comprising the steps of (a)
producing a library of biological sequences; (b) generating
a set of transcript sequences, where each of the transcript
sequences in said set is indicative of a different one of
the biological sequences of the library; (c) processing the
transcript sequences in a programmed computer (in which a
database of reference transcript sequences indicative of
reference sequences is stored), to generate an identified
sequence value for each of the transcript sequences, where
each said identified sequence value is indicative of
sequence annotation and a degree of match between one of
the biological sequences of the library and at least one of
the reference sequences; and (d) processing each said
identified sequence value to generate final data values
indicative of the number of times each identified sequence
value is present in the library.
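
Step (d) amounts to counting how often each identified sequence value occurs in the library. A minimal Python sketch follows, assuming that an identified sequence value can be represented as a pair of (database entry, match type); the entry names shown are hypothetical placeholders rather than real database loci, and the match-type labels are borrowed from the wording used later in this description.

```python
from collections import Counter

def tabulate_abundance(identified_values):
    """Count occurrences of each identified sequence value.

    identified_values: iterable of (entry, match_type) tuples, one per
    sequenced clone (assumed representation).
    Returns a Counter mapping each value to its abundance number.
    """
    return Counter(identified_values)

# Hypothetical identified sequence values for five clones.
identified_values = [
    ("ENTRY_A", "exact"),
    ("ENTRY_A", "exact"),
    ("ENTRY_B", "exact"),
    ("ENTRY_C", "other species"),
    ("ENTRY_A", "exact"),
]
final_data = tabulate_abundance(identified_values)
```

Keeping the match type in the key means that an exact match and a homologous match to the same entry are counted separately, which is one possible reading of "identified sequence value".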

The invention also includes a method of comparing two
specimens containing gene transcripts. The first specimen
is processed as described above. The second specimen is
used to produce a second library of biological sequences,
which is used to generate a second set of transcript
sequences, where each of the transcript sequences in the
second set is indicative of one of the biological sequences
of the second library. Then the second set of transcript
sequences is processed in a programmed computer to generate
a second set of identified sequence values, namely the
further identified sequence values, each of which is
indicative of a sequence annotation and includes a degree
of match between one of the biological sequences of the
second library and at least one of the reference sequences.
The further identified sequence values are processed to
generate further final data values indicative of the number
of times each further identified sequence value is present
in the second library. The final data values from the
first specimen and the further identified sequence values
from the second specimen are processed to generate ratios
of transcript sequences, which indicate the differences in
the number of gene transcripts between the two specimens.

In a further embodiment, the method includes
quantifying the relative abundance of mRNA in a biological
specimen by (a) isolating a population of mRNA transcripts
from a biological specimen; (b) identifying genes from
which the mRNA was transcribed by a sequence-specific
method; (c) determining the numbers of mRNA transcripts
corresponding to each of the genes; and (d) using the mRNA
transcript numbers to determine the relative abundance of
mRNA transcripts within the population of mRNA transcripts.

Also disclosed is a method of producing a gene
transcript image analysis by first obtaining a mixture of
mRNA, from which cDNA copies are made. The cDNA is
inserted into a suitable vector which is used to transfect
suitable host strain cells which are plated out and
permitted to grow into clones, each clone representing a
unique mRNA. A representative population of clones
transfected with cDNA is isolated. Each clone in the
population is identified by a sequence-specific method
which identifies the gene from which the unique mRNA was
transcribed. The number of times each gene is identified
to a clone is determined to evaluate gene transcript
abundance. The genes and their abundances are listed in
order of abundance to produce a gene transcript image.

In a further embodiment, the relative abundance of the
gene transcripts in one cell type or tissue is compared
with the relative abundance of gene transcript numbers in a
second cell type or tissue in order to identify the
differences and similarities.

In a further embodiment, the method includes a system
for analyzing a library of biological sequences including a
means for receiving a set of transcript sequences, where
each of the transcript sequences is indicative of a
different one of the biological sequences of the library;
and a means for processing the transcript sequences in a
computer system in which a database of reference transcript
sequences indicative of reference sequences is stored,
wherein the computer is programmed with software for
generating an identified sequence value for each of the
transcript sequences, where each said identified sequence
value is indicative of a sequence annotation and the degree
of match between a different one of the biological
sequences of the library and at least one of the reference
sequences, and for processing each said identified sequence
value to generate final data values indicative of the
number of times each identified sequence value is present
in the library.

In essence, the invention is a method and system for
quantifying the relative abundance of gene transcripts in a
biological specimen. The invention provides a method for
comparing the gene transcript image from two or more
different biological specimens in order to distinguish
between the two specimens and identify one or more genes
which are differentially expressed between the two
specimens. Thus, this gene transcript image and its
comparison can be used as a diagnostic. One embodiment of
the method generates high-throughput sequence-specific
analysis of multiple RNAs or their corresponding cDNAs: a
gene transcript image. Another embodiment of the method
produces the gene transcript imaging analysis by the use of
high-throughput cDNA sequence analysis. In addition, two
or more gene transcript images can be compared and used to
detect or diagnose a particular biological state, disease,
or condition which is correlated to the relative abundance
of gene transcripts in a given cell or population of cells.

4. DESCRIPTION OF THE TABLES AND DRAWINGS
4.1. TABLES 

Table 1 presents a detailed explanation of the letter
codes utilized in Tables 2-5.

Table 2 lists the one hundred most common gene
transcripts. It is a partial list of isolates from the
HUVEC cDNA library prepared and sequenced as described
below. The left-hand column refers to the sequence's order
of abundance in this table. The next column, labeled
"number", is the clone number of the first HUVEC sequence
identification reference matching the sequence in the
"entry" column. Isolates that have not been
sequenced are not present in Table 2. The next column,
labeled "N", indicates the total number of cDNAs which have
the same degree of match with the sequence of the reference
transcript in the "entry" column.

The column labeled "entry" gives the NIH GENBANK locus
name, which corresponds to the library sequence numbers.
The "s" column indicates in a few cases the species of the
reference sequence. The code for column "s" is given in
Table 1. The column labeled "descriptor" provides a plain
English explanation of the identity of the sequence
corresponding to the NIH GENBANK locus name in the "entry"
column.

Table 3 is a comparison of the top fifteen most 
abundant gene transcripts in normal monocytes and activated 
macrophage cells.

Table 4 is a detailed summary of library subtraction
analysis comparing the THP-1 and human macrophage
cDNA sequences. In Table 4, the same code as in Table 2 is
used. Additional columns are for "bgfreq" (abundance
number in the subtractant library), "rfend" (abundance
number in the target library) and "ratio" (the target
abundance number divided by the subtractant abundance
number). As is clear from perusal of the table, when the
abundance number in the subtractant library is "0", the
target abundance number is divided by 0.05. This is a way
of obtaining a result (not possible dividing by 0) and
distinguishing the result from ratios of subtractant
numbers of 1.
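
The "ratio" column rule described for Table 4 can be written as a small Python function. This is a sketch only and is not the source code of Table 5; the function and parameter names are chosen here for illustration, while the 0.05 substitute for a zero subtractant count comes directly from the text.

```python
def subtraction_ratio(rfend, bgfreq, zero_substitute=0.05):
    """Ratio of target abundance (rfend) to subtractant abundance (bgfreq).

    When the subtractant abundance number is 0, the target abundance
    number is divided by `zero_substitute` instead, so that a result is
    obtained and stays distinguishable from a subtractant count of 1.
    """
    denominator = bgfreq if bgfreq > 0 else zero_substitute
    return rfend / denominator

ratio_present = subtraction_ratio(rfend=8, bgfreq=2)  # transcript seen in both libraries
ratio_absent = subtraction_ratio(rfend=3, bgfreq=0)   # absent from the subtractant library
ratio_single = subtraction_ratio(rfend=3, bgfreq=1)   # seen once in the subtractant library
```

With a target abundance of 3, a subtractant count of 0 gives a ratio near 60 while a subtractant count of 1 gives 3, so the two cases remain clearly separated in a sorted table.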

Table 5 is the computer program, written in source
code, for generating gene transcript subtraction profiles.

Table 6 is a partial listing of database entries used
in the electronic northern blot analysis as provided by the
present invention.

4.2. BRIEF DESCRIPTION OF THE DRAWINGS

Figure 1 is a chart summarizing data collected and
stored regarding the library construction portion of
sequence preparation and analysis.

Figure 2 is a diagram representing the sequence of
operations performed by "abundance sort" software in a
class of preferred embodiments of the inventive method.

Figure 3 is a block diagram of a preferred embodiment
of the system of the invention.

Figure 4 is a more detailed block diagram of the
bioinformatics process from new sequence (that has already
been sequenced but not identified) to printout of the
transcript imaging analysis and the provision of database
subscriptions.



5. DETAILED DESCRIPTION OF THE INVENTION

The present invention provides a method to compare the
relative abundance of gene transcripts in different
biological specimens by the use of high-throughput
sequence-specific analysis of individual RNAs or their
corresponding cDNAs (or alternatively, of data representing
other biological sequences). This process is denoted
herein as gene transcript imaging. The quantitative
analysis of the relative abundance for a set of gene
transcripts is denoted herein as "gene transcript image
analysis" or "gene transcript frequency analysis". The
present invention allows one to obtain a profile for gene
transcription in any given population of cells or tissue
from any type of organism. The invention can be applied to
obtain a profile of a specimen consisting of a single cell
(or clones of a single cell), or of many cells, or of
tissue more complex than a single cell and containing
multiple cell types, such as liver.

The invention has significant advantages in the fields
of diagnostics, toxicology and pharmacology, to name a few.
A highly sophisticated diagnostic test can be performed on
the ill patient in whom a diagnosis has not been made. A
biological specimen consisting of the patient's fluids or
tissues is obtained, and the gene transcripts are isolated
and expanded to the extent necessary to determine their
identity. Optionally, the gene transcripts can be
converted to cDNA. A sampling of the gene transcripts is
subjected to sequence-specific analysis and quantified.
These gene transcript sequence abundances are compared
against reference database sequence abundances including
normal data sets for diseased and healthy patients. The
patient has the disease(s) with which the patient's data
set most closely correlates.

For example, gene transcript frequency analysis can be
used to differentiate normal cells or tissues from diseased
cells or tissues, just as it highlights differences between
normal monocytes and activated macrophages in Table 3.
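
The comparison of a patient's transcript abundances against reference data sets can be sketched as follows. The description says only that the patient's data set is matched to the reference with which it "most closely correlates"; the use of the Pearson correlation coefficient here is an assumption, as are the function names, gene labels and abundance values.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length numeric lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y) if sd_x and sd_y else 0.0

def closest_reference(patient, references):
    """Return (best reference name, {name: correlation}).

    patient and each reference map gene -> relative abundance;
    genes missing from a profile are treated as abundance 0.
    """
    genes = sorted(set(patient).union(*[set(r) for r in references.values()]))
    patient_vec = [patient.get(g, 0.0) for g in genes]
    scores = {
        name: pearson(patient_vec, [ref.get(g, 0.0) for g in genes])
        for name, ref in references.items()
    }
    return max(scores, key=scores.get), scores

# Hypothetical relative abundances.
patient = {"g1": 0.5, "g2": 0.3, "g3": 0.2}
references = {
    "healthy": {"g1": 0.2, "g2": 0.3, "g3": 0.5},
    "disease_X": {"g1": 0.55, "g2": 0.25, "g3": 0.2},
}
best, scores = closest_reference(patient, references)
```

Other similarity measures (rank correlation, distance metrics) would fit the same interface; the choice would need to be validated against real reference data.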

In toxicology, a fundamental question is which tests
are most effective in predicting or detecting a toxic
effect. Gene transcript imaging provides highly detailed
information on the cell and tissue environment, some of
which would not be obvious in conventional, less detailed
screening methods. The gene transcript image is a more
powerful method to predict drug toxicity and efficacy.
Similar benefits accrue in the use of this tool in
pharmacology. The gene transcript image can be used
selectively to look at protein categories which are
expected to be affected, for example, enzymes which
detoxify toxins.

In an alternative embodiment, comparative gene
transcript frequency analysis is used to differentiate
between cancer cells which respond to anti-cancer agents
and those which do not respond. Examples of anti-cancer
agents are tamoxifen, vincristine, vinblastine,
podophyllotoxins, etoposide, teniposide, cisplatin,
biologic response modifiers such as interferon, IL-2,
GM-CSF, enzymes, hormones and the like. This method also
provides a means for sorting the gene transcripts by
functional category. In the case of cancer cells,
transcription factors or other essential regulatory
molecules are very important categories to analyze across
different libraries.

In yet another embodiment, comparative gene transcript
frequency analysis is used to differentiate between control
liver cells and liver cells isolated from patients treated
with experimental drugs like FIAU to distinguish between
pathology caused by the underlying disease and that caused
by the drug.

In yet another embodiment, comparative gene transcript
frequency analysis is used to differentiate between brain
tissue from patients treated and untreated with lithium.

In a further embodiment, comparative gene transcript
frequency analysis is used to differentiate between
cyclosporin and FK506-treated cells and normal cells.

In a further embodiment, comparative gene transcript
frequency analysis is used to differentiate between virally
infected (including HIV-infected) human cells and
uninfected human cells. Gene transcript frequency analysis
is also used to rapidly survey gene transcripts in
HIV-resistant, HIV-infected, and HIV-sensitive cells.
Comparison of gene transcript abundance will indicate the
success of treatment and/or new avenues to study.

In a further embodiment, comparative gene transcript
frequency analysis is used to differentiate between
bronchial lavage fluids from healthy and unhealthy patients
with a variety of ailments.

In a further embodiment, comparative gene transcript
frequency analysis is used to differentiate between cell,
plant, microbial and animal mutants and wild-type species.
In addition, the transcript abundance program is adapted to
permit the scientist to evaluate the transcription of one
gene in many different tissues. Such comparisons could
assemble sequenced DNA fragments into Assemblages, a
special grouping of data where the relationships between
sequences are shown by graphic overlap, alignment and
statistical views. The process is based on the
Meyers-Kececioglu model of fragment assembly (INHERIT™
Assembler User's Manual, Applied Biosystems, Inc., Foster
City, CA), and uses graph theory as the foundation of a
very rigorous multiple sequence alignment engine for
assembling DNA sequence fragments. Other assembly programs
that can be used include MEGALIGN (available from DNASTAR
Inc., Madison, WI), Dasher and STADEN (available from Roger
Staden, Cambridge, England).

Next, with reference to Fig. 2, we describe in more
detail the "abundance sort" program which implements
above-mentioned "step (b)" to tabulate the number of
sequences of the library which match each database entry
(the "abundance number" for each database entry).

Fig. 2 is a flow chart of a preferred embodiment of
the abundance sort program. A source code listing of this
embodiment of the abundance sort program is set forth in
Table 5. This version of the abundance sort
program is written using the FoxBASE programming language
commercially available from Microsoft Corporation.
Although FoxBASE was the program chosen for the first
iteration of this technology, it should not be considered
limiting. Many other programming languages, Sybase being a
particularly desirable alternative, can also be used, as
will be obvious to one with ordinary skill in the art. The
subroutine names specified in Fig. 2 correspond to
subroutines listed in Table 5.

With reference again to Fig. 2, the "Identified
Sequences" are transcript sequences representing each
sequence of the library and a corresponding identification
of the database entry (if any) which it matches. In other
words, the "Identified Sequences" are transcript sequences
representing the output of above-discussed "step (a)."

Fig. 3 is a block diagram of a system for implementing
the invention. The Fig. 3 system includes library
generation unit 2 which generates a library and asserts an
output stream of transcript sequences indicative of the
biological sequences comprising the library. Programmed
processor 4 receives the data stream output from unit 2 and
processes this data in accordance with above-discussed
"step (a)" to generate the Identified Sequences. Processor
4 can be a processor programmed with the commercially
available computer program known as the INHERIT 670
Sequence Analysis System and the commercially available
computer program known as the Factura program (both
available from Applied Biosystems Inc.) and with the UNIX
operating system.

Still with reference to Fig. 3, the Identified 
Sequences are loaded into processor 6 which is programmed 
with the abundance sort program. Processor 6 generates the 
15 Final Transcript sequences indicated in both Figs. 2 and 3 
Fig. 4 shows a more detailed block diagram of a planned 
relational computer system, including various searching 
techniques which can be implemented, along with an 
assortment of databases to query against. 

With reference to Fig. 2, the abundance sort program
first performs an operation known as "Tempnum" on the
Identified Sequences, to discard all of the Identified
Sequences except those which match database entries of
selected types. For example, the Tempnum process can
select Identified Sequences which represent matches of the
following types with database entries (see above for
definition): "exact" matches, human "homologous" matches,
"other species" matches (representing genes present in
species other than human), "no" matches (no significant
regions of homology with database entries representing
previously identified nucleotide sequences), "I" matches
(Incyte, for not previously known DNA sequences), or "X"
matches (matches ESTs in reference database). This
eliminates the U, S, M, V, A, R and D sequences (see Table 1
for definitions).
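
The Tempnum selection can be pictured as a simple filter over
match-type codes. The following Python sketch is illustrative
only and is not the program of Table 5. The record layout, the
function name, and the codes "E", "H", "O" and "N" are
assumptions made for the example (the text above spells out only
the "I" and "X" codes and the discarded U, S, M, V, A, R and D
codes); the third record is invented.

```python
# Assumed single-letter codes for the retained match types.
# Only "I" and "X" are named as letters in the text; the rest are hypothetical.
KEPT_MATCH_TYPES = {"E", "H", "O", "N", "I", "X"}


def tempnum(identified_sequences, kept=KEPT_MATCH_TYPES):
    """Return only the records whose match-type code is in `kept`."""
    return [rec for rec in identified_sequences if rec["match"] in kept]


records = [
    {"clone": 15365, "entry": "HSRPL41", "match": "E"},
    {"clone": 15004, "entry": "NCY015004", "match": "I"},
    {"clone": 99999, "entry": "example", "match": "U"},  # invented, discarded
]
selected = tempnum(records)
print([rec["clone"] for rec in selected])
```

Passing a different `kept` set would model a Tempnum run that
selects other database entry types.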

The identified sequence values selected during the
"Tempnum" process then undergo a further selection (weeding
out) operation known as "Tempred." This operation can, for
example, discard all identified sequence values
representing matches with selected database entries.

The identified sequence values selected during the
"Tempred" process are then classified according to library
during the "Tempdesig" operation. It is contemplated that
the "Identified Sequences" can represent sequences from a
single library, or from two or more libraries.

Consider first the case that the identified sequence
values represent sequences from a single library. In this
case, all the identified sequence values determined during
"Tempred" undergo sorting in the "Templib" operation,
further sorting in the "Libsort" operation, and finally
additional sorting in the "Temptarsort" operation. For
example, these three sorting operations can sort the
identified sequences in order of decreasing "abundance
number" (to generate a list of decreasing abundance
numbers, each abundance number corresponding to a unique
identified sequence entry, or several lists of decreasing
abundance numbers, with the abundance numbers in each list
corresponding to database entries of a selected type) with
redundancies eliminated from each sorted list. In this
case, the operation identified as "Cruncher" can be
bypassed, so that the "Final Data" values are the organized
transcript sequences produced during the "Temptarsort"
operation.

We next consider the case that the transcript
sequences produced during the "Tempred" operation represent
sequences from two libraries (which we will denote the
"target" library and the "subtractant" library). For
example, the target library may consist of cDNA sequences
from clones of a diseased cell, while the subtractant
library may consist of cDNA sequences from clones of the
diseased cell after treatment by exposure to a drug. For
another example, the target library may consist of cDNA
sequences from clones of a cell type from a young human,
while the subtractant library may consist of cDNA sequences
from clones of the same cell type from the same human at
different ages.

In this case, the "Tempdesig" operation routes all
transcript sequences representing the target library for
processing in accordance with "Templib" (and then "Libsort"
and "Temptarsort"), and routes all transcript sequences
representing the subtractant library for processing in
accordance with "Tempsub" (and then "Subsort" and
"Tempsubsort"). For example, the consecutive "Templib,"
"Libsort," and "Temptarsort" sorting operations sort
identified sequences from the target library in order of
decreasing abundance number (to generate a list of
decreasing abundance numbers, each abundance number
corresponding to a database entry, or several lists of
decreasing abundance numbers, with the abundance numbers in
each list corresponding to database entries of a selected
type) with redundancies eliminated from each sorted list.
The consecutive "Tempsub," "Subsort," and "Tempsubsort"
sorting operations sort identified sequences from the
subtractant library in order of decreasing abundance number
(to generate a list of decreasing abundance numbers, each
abundance number corresponding to a database entry, or
several lists of decreasing abundance numbers, with the
abundance numbers in each list corresponding to database
entries of a selected type) with redundancies eliminated
from each sorted list.
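
The net effect of the three consecutive sorting operations (for
either library) can be sketched in a few lines of Python. This
is a minimal illustration, assuming that each identified sequence
is represented simply by its database entry name and that the
"abundance number" is the count of identified sequences matching
that entry. The function name and the sample entry list are
hypothetical, and the sketch does not reproduce the separate
Templib, Libsort and Temptarsort steps.

```python
from collections import Counter


def abundance_sort(entries):
    """Collapse redundant entries and list them by decreasing abundance.

    `entries` holds one database entry name per identified sequence.
    Ties are broken alphabetically so that the output is deterministic.
    """
    counts = Counter(entries)  # eliminates redundancies
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


# Invented example: six identified sequences matching three entries.
target_entries = ["HSFIB1", "HSRPL41", "HSFIB1", "HSET", "HSRPL41", "HSFIB1"]
print(abundance_sort(target_entries))
```

Running the same function on the subtractant library's entries
would give the second sorted list that the Cruncher operation
consumes.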

The transcript sequences output from the "Temptarsort"
operation typically represent sorted lists from which a
histogram could be generated in which position along one
(e.g., horizontal) axis indicates abundance number (of
target library sequences), and position along another
(e.g., vertical) axis indicates identified sequence value
(e.g., human or non-human gene type). Similarly, the
transcript sequences output from the "Tempsubsort"
operation typically represent sorted lists from which a
histogram could be generated in which position along one
(e.g., horizontal) axis indicates abundance number (of
subtractant library sequences), and position along another
(e.g., vertical) axis indicates identified sequence value
(e.g., human or non-human gene type).

The transcript sequences (sorted lists) output from
the Tempsubsort and Temptarsort sorting operations are
combined during the operation identified as "Cruncher."
The "Cruncher" process identifies pairs of corresponding
target and subtractant abundance numbers (both representing
the same identified sequence value), and divides one by the
other to generate a "ratio" value for each pair of
corresponding abundance numbers, and then sorts the ratio
values in order of decreasing ratio value. The data output
from the "Cruncher" operation (the Final Transcript
sequence in Fig. 2) is typically a sorted list from which a
histogram could be generated in which position along one
axis indicates the size of a ratio of abundance numbers
(for corresponding identified sequence values from target
and subtractant libraries) and position along another axis
indicates identified sequence value (e.g., gene type).
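
The pairing and division performed by "Cruncher" can be sketched
as follows. This Python fragment is an illustration under stated
assumptions, not the program of Table 5: each sorted list is
held as a dictionary from entry name to abundance number, the
target number is divided by the subtractant number, and entries
present in only one list are skipped here. The function name is
hypothetical; the three sample counts are taken from Table 4.

```python
def cruncher(target, subtractant):
    """Pair abundance numbers by entry, divide, and sort by decreasing ratio."""
    shared = target.keys() & subtractant.keys()
    ratios = [(entry, target[entry] / subtractant[entry]) for entry in shared]
    return sorted(ratios, key=lambda pair: (-pair[1], pair[0]))


# Counts from Table 4 (activated library as target, normal as subtractant).
target = {"HUMMIP1A": 121, "HUMCAPPRO": 10, "HUMIL1": 131}
subtractant = {"HUMMIP1A": 3, "HUMCAPPRO": 1}

for entry, ratio in cruncher(target, subtractant):
    print(f"{entry:10s} {ratio:8.3f}")
```

HUMIL1 has no counterpart in the subtractant dictionary, so this
simple version produces no ratio for it.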

Preferably, prior to obtaining a ratio between the two
library abundance values, the Cruncher operation also
divides each ratio value by the total number of sequences
in one or both of the target and subtractant libraries.
The resulting lists of "relative" ratio values generated by
the Cruncher operation are useful for many medical,
scientific, and industrial applications. Also preferably,
the output of the Cruncher operation is a set of lists,
each list representing a sequence of decreasing ratio
values for a different selected subset (e.g., protein
family) of database entries.
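
One possible reading of this preferred variant is sketched below
in Python: each abundance number is first divided by the size of
its own library, the two relative abundances are then divided,
and the resulting ratios are grouped into one sorted list per
subset. Everything in the sketch is an assumption made for
illustration, including the function name, the family labels,
the counts and the library sizes.

```python
from collections import defaultdict


def relative_ratio_lists(target, subtractant, target_total,
                         subtractant_total, family_of):
    """Return {family: [(entry, relative ratio), ...]} sorted by decreasing ratio."""
    lists = defaultdict(list)
    for entry in target.keys() & subtractant.keys():
        target_rel = target[entry] / target_total
        subtractant_rel = subtractant[entry] / subtractant_total
        lists[family_of[entry]].append((entry, target_rel / subtractant_rel))
    for rows in lists.values():
        rows.sort(key=lambda pair: (-pair[1], pair[0]))
    return dict(lists)


# Invented counts, library sizes and family labels.
target = {"geneA": 10, "geneB": 4, "geneC": 6}
subtractant = {"geneA": 5, "geneB": 8, "geneC": 3}
family_of = {"geneA": "kinase", "geneB": "kinase", "geneC": "ribosomal"}

result = relative_ratio_lists(target, subtractant, 1000, 2000, family_of)
print(result)
```

Normalizing before dividing keeps a large library from appearing
to upregulate every gene merely because more clones were sampled
from it.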

In one example, the abundance sort program of the
invention tabulates for a library the numbers of mRNA
transcripts corresponding to each gene identified in a
database. These numbers are divided by the total number of
clones sampled. The results of the division reflect the
relative abundance of the mRNA transcripts in the cell type
or tissue from which they were obtained. Obtaining this
final data set is referred to herein as "gene transcript
image analysis." The resulting subtracted data show
exactly what proteins and genes are upregulated and
downregulated in highly detailed complexity.
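
The division described above is simple enough to show directly.
The following Python sketch uses three counts and the total of
5000 clones analyzed from Table 2; the function name and the
dictionary layout are assumptions made for the example.

```python
def gene_transcript_image(counts, total_clones):
    """Divide each transcript count by the total number of clones sampled."""
    return {entry: n / total_clones for entry, n in counts.items()}


# Counts from Table 2 (HUVEC library, 5000 clones analyzed).
huvec_counts = {"HSRPL41": 67, "HSFIB1": 47, "HSACTCGR": 31}
image = gene_transcript_image(huvec_counts, 5000)

for entry, fraction in image.items():
    print(f"{entry:10s} {fraction:.4f}")
```

On these figures the ribosomal protein L41 transcript accounts
for 67 of 5000 clones, or about 1.3% of the clones sampled.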

6.6. HUVEC cDNA LIBRARY
Table 2 is an abundance table listing the various gene
transcripts in an induced HUVEC library. The transcripts
are listed in order of decreasing abundance. This
computerized sorting simplifies analysis of the tissue and
speeds identification of significant new proteins which are
specific to this cell type. This type of endothelial cell
lines tissues of the cardiovascular system, and the more
that is known about its composition, particularly in
response to activation, the more choices of protein targets
become available to affect in treating disorders of this
tissue, such as the highly prevalent atherosclerosis.

6.7. MONOCYTE-CELL AND MAST-CELL cDNA LIBRARIES

Tables 3 and 4 show truncated comparisons of two
libraries. In Tables 3 and 4, the "normal monocytes" are
the HMC-1 cells, and the "activated macrophages" are the
THP-1 cells pretreated with PMA and activated with LPS.
Table 3 lists in descending order of abundance the most
abundant gene transcripts for both cell types. With only
15 gene transcripts from each cell type, this table permits
quick, qualitative comparison of the most common
transcripts. This abundance sort, with its convenient
side-by-side display, provides an immediately useful
research tool. In this example, this research tool
discloses that 1) only one of the top 15 activated
macrophage transcripts is found in the top 15 normal
monocyte gene transcripts (poly A binding protein); and 2)
a new gene transcript (previously unreported in other
databases) is relatively highly represented in activated
macrophages but is not similarly prominent in normal
macrophages. Such a research tool provides researchers
with a short-cut to new proteins, such as receptors, cell-
surface and intracellular signalling molecules, which can
serve as drug targets in commercial drug screening
programs. Such a tool could save considerable time over
that consumed by a hit and miss discovery program aimed at
identifying important proteins in and around cells, because
those proteins carrying out everyday cellular functions and
represented as steady state mRNA are quickly eliminated
from further characterization.

This illustrates how the gene transcript profiles
change with altered cellular function. Those skilled in
the art know that the biochemical composition of cells also
changes with other functional changes such as cancer,
including cancer's various stages, and exposure to
toxicity. A gene transcript subtraction profile such as in
Table 3 is useful as a first screening tool for such gene
expression and protein studies.

6.8. SUBTRACTION ANALYSIS OF NORMAL MONOCYTE-CELL AND
ACTIVATED MONOCYTE-CELL cDNA LIBRARIES

Once the cDNA data are in the computer, the computer
program as disclosed in Table 5 was used to obtain ratios
of all the gene transcripts in the two libraries discussed
in Example 6.7, and the gene transcripts were sorted by the
descending values of their ratios. If a gene transcript is
not represented in one library, that gene transcript's
abundance is unknown but appears to be less than 1. As an
approximation (and to obtain a ratio, which would not be
possible if the unrepresented gene were given an abundance
of zero), genes which are represented in only one of the
two libraries are assigned an abundance of 1/2. Using 1/2
for unrepresented clones increases the relative importance
of "turned-on" and "turned-off" genes, whose products would
be drug candidates. The resulting print-out is called a
subtraction table and is an extremely valuable screening
method, as is shown by the following data.
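
The 1/2 substitution can be checked against the ratio column of
Table 4. The Python sketch below is an illustration only and is
not the FoxBASE program of Table 5. The function name and the
tuple layout (entry, bgfreq, rfend, ratio) are assumptions; the
first three sample rows use counts from Table 4, and the
"turned-off" entry is invented to show the opposite case.

```python
def subtraction_table(target, subtractant, unrepresented=0.5):
    """Build (entry, bgfreq, rfend, ratio) rows sorted by descending ratio.

    A count of zero in either library is replaced by `unrepresented`
    (1/2) before dividing, so that a ratio can always be formed.
    """
    rows = []
    for entry in sorted(target.keys() | subtractant.keys()):
        rfend = target.get(entry, 0)        # count in the target library
        bgfreq = subtractant.get(entry, 0)  # count in the subtracted library
        ratio = (rfend or unrepresented) / (bgfreq or unrepresented)
        rows.append((entry, bgfreq, rfend, ratio))
    rows.sort(key=lambda row: -row[3])
    return rows


activated = {"HUMIL1": 131, "HUMMIP1A": 121, "HUMCAPPRO": 10}
normal = {"HUMMIP1A": 3, "HUMCAPPRO": 1, "turned_off_example": 4}  # last one invented

for entry, bgfreq, rfend, ratio in subtraction_table(activated, normal):
    print(f"{entry:20s} {bgfreq:4d} {rfend:4d} {ratio:8.3f}")
```

With 1/2 in the denominator, 131 clones against none gives the
262.00 printed for IL 1-beta in Table 4, and a gene seen only in
the subtracted library falls to the bottom of the list with a
ratio below 1.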

Table 4 is a subtraction table, in which the normal
monocyte library was electronically "subtracted" from the
activated macrophage library. This table highlights most
effectively the changes in abundance of the gene
transcripts by activation of macrophages. Even among the
first 20 gene transcripts listed, there are several unknown
gene transcripts. Thus, electronic subtraction is a useful
tool with which to assist researchers in identifying much
more quickly the basic biochemical changes between two cell
types. Such a tool can save universities and
pharmaceutical companies which spend billions of dollars on
research valuable time and laboratory resources at the
early discovery stage and can speed up the drug development
cycle, which in turn permits researchers to set up drug
screening programs much earlier. Thus, this research tool
provides a way to get new drugs to the public faster and
more economically.

Also, such a subtraction table can be obtained for
patient diagnosis. An individual patient sample (such as
monocytes obtained from a biopsy or blood sample) can be
compared with data provided herein to diagnose conditions
associated with macrophage activation.

Table 4 uncovered many new gene transcripts (labeled
Incyte clones). Note that many genes are turned on in the
activated macrophage (i.e., the monocyte had a 0 in the
bgfreq column). This screening method is superior to other
screening techniques, such as the western blot, which are
incapable of uncovering such a multitude of discrete new
gene transcripts.

The subtraction-screening technique has also uncovered
a high number of cancer gene transcripts (oncogenes rho,
ETS2, rab-2 ras, YPT1-related, and acute myeloid leukemia
mRNA) in the activated macrophage. These transcripts may
be attributed to the use of immortalized cell lines and are
inherently interesting for that reason. This screening
technique offers a detailed picture of upregulated
transcripts including oncogenes, which helps explain why
anti-cancer drugs interfere with the patient's immunity
mediated by activated macrophages. Armed with knowledge
gained from this screening method, those skilled in the art
can set up more targeted, more effective drug screening
programs to identify drugs which are differentially
effective against 1) both relevant cancers and activated
macrophage conditions with the same gene transcript
profile; 2) cancer alone; and 3) activated macrophage
conditions.

Smooth muscle senescent protein (22 kd) was 
upregulated in the activated macrophage, which indicates 
that it is a candidate to block in controlling 
inflammation. 

6.9. SUBTRACTION ANALYSIS OF NORMAL LIVER CELLS AND
HEPATITIS INFECTED LIVER CELL cDNA LIBRARIES

In this example, rats are exposed to hepatitis virus
and maintained in the colony until they show definite signs
of hepatitis. Of the rats diagnosed with hepatitis, one
half of the rats are treated with a new anti-hepatitis
agent (AHA). Liver samples are obtained from all rats
before exposure to the hepatitis virus and at the end of
AHA treatment or no treatment. In addition, liver samples
can be obtained from rats with hepatitis just prior to AHA
treatment.

The liver tissue is treated as described in Examples
6.2 and 6.3 to obtain mRNA and subsequently to sequence
cDNA. The cDNA from each sample are processed and analyzed
for abundance according to the computer program in Table 5.
The resulting gene transcript images of the cDNA provide
detailed pictures of the baseline (control) for each animal
and of the infected and/or treated state of the animals.
cDNA data for a group of samples can be combined into a
group summary gene transcript profile for all control
samples, all samples from infected rats and all samples
from AHA-treated rats.

Subtractions are performed between appropriate
individual libraries and the grouped libraries. For
individual animals, control and post-study samples can be
subtracted. Also, if samples are obtained before and after
AHA treatment, that data from individual animals and
treatment groups can be subtracted. In addition, the data
for all control samples can be pooled and averaged. The
control average can be subtracted from averages of both
post-study AHA and post-study non-AHA cDNA samples. If
pre- and post-treatment samples are available, pre- and
post-treatment samples can be compared individually (or
electronically averaged) and subtracted.
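
Pooling and averaging a group of samples before subtraction
might look like the following Python sketch. The function name,
the profile layout (a dictionary from entry name to relative
abundance per animal) and all numbers are hypothetical; a gene
missing from one animal's profile is treated as zero for that
animal.

```python
def average_profile(profiles):
    """Average several gene transcript profiles into one group profile."""
    genes = set()
    for profile in profiles:
        genes.update(profile)
    return {
        gene: sum(profile.get(gene, 0.0) for profile in profiles) / len(profiles)
        for gene in genes
    }


# Invented relative abundances for two control animals.
control_rats = [
    {"albumin": 0.020, "geneX": 0.004},
    {"albumin": 0.030},
]
control_average = average_profile(control_rats)
print(sorted(control_average.items()))
```

The averaged control profile can then take the place of a single
library in the subtraction against the averaged post-study
profiles.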

These subtraction tables are used in two general ways.
First, the differences are analyzed for gene transcripts
which are associated with continuing hepatic deterioration
or healing. The subtraction tables are tools to isolate
the effects of the drug treatment from the underlying basic
pathology of hepatitis. Because hepatitis affects many
parameters, additional liver toxicity has been difficult to
detect with only blood tests for the usual enzymes. The
gene transcript profile and subtraction provide a much
more complex biochemical picture which researchers have
needed to analyze such difficult problems.

Second, the subtraction tables provide a tool for
identifying clinical markers, individual proteins or other
biochemical determinants which are used to predict and/or
evaluate a clinical endpoint, such as disease, improvement
due to the drug, and even additional pathology due to the
drug. The subtraction tables specifically highlight genes
which are turned on or off. Thus, the subtraction tables
provide a first screen for a set of gene transcript
candidates for use as clinical markers. Subsequently,
electronic subtractions of additional cell and tissue
libraries reveal which of the potential markers are in fact
found in different cell and tissue libraries. Candidate
gene transcripts found in additional libraries are removed
from the set of potential clinical markers. Then, tests of
blood or other relevant samples which are known to lack and
have the relevant condition are compared to validate the
selection of the clinical marker. In this method the
particular physiologic function of the protein transcript
need not be determined to qualify the gene transcript as a
clinical marker.
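
The removal of candidates that also appear in additional
libraries amounts to a set difference. The Python sketch below
is illustrative only; the function name, the candidate names and
the contents of the additional libraries are all hypothetical.

```python
def remove_nonspecific(candidates, additional_libraries):
    """Drop candidate markers that are found in any additional library."""
    seen_elsewhere = set()
    for library in additional_libraries:
        seen_elsewhere.update(library)
    return [c for c in candidates if c not in seen_elsewhere]


# Invented candidates from a first-screen subtraction table.
candidates = ["markerA", "markerB", "markerC"]
additional_libraries = [
    {"markerB", "housekeeping1"},  # e.g. another tissue library
    {"housekeeping2"},             # e.g. another cell library
]
print(remove_nonspecific(candidates, additional_libraries))
```

The surviving candidates would still need the validation against
samples known to have and to lack the condition that the text
describes.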

6.10. ELECTRONIC NORTHERN BLOT
One limitation of electronic subtraction is that it is
difficult to compare more than a pair of images at once.
Once particular individual gene products are identified as
relevant to further study (via electronic subtraction or
other methods), it is useful to study the expression of
single genes in a multitude of different tissues. In the
lab, the technique of "Northern" blot hybridization is used
for this purpose. In this technique, a single cDNA, or a
probe corresponding thereto, is labeled and then hybridized
against a blot containing RNA samples prepared from a
multitude of tissues or cell types. Upon autoradiography,
the pattern of expression of that particular gene, one at a
time, can be quantitated in all the included samples.

In contrast, a further embodiment of this invention is
the computerized form of this process, termed here
"electronic northern blot." In this variation, a single
gene is queried for expression against a multitude of
prepared and sequenced libraries present within the
database. In this way, the pattern of expression of any
single candidate gene can be examined instantaneously and
effortlessly. More candidate genes can thus be scanned,
leading to more frequent and fruitfully relevant
discoveries. The computer program included as Table 5
includes a program for performing this function, and Table
6 is a partial listing of entries of the database used in
the electronic northern blot analysis.
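
The query behind an electronic northern blot can be sketched as
a lookup of one entry across many libraries. The Python fragment
below is a minimal illustration and is not the program of Table
5. The function name and data layout are assumptions; the HUVEC
counts come from Table 2, while "library_B" and "library_C" and
their counts are invented.

```python
def electronic_northern(entry, libraries):
    """Return the relative abundance of one entry in every library.

    `libraries` maps a library name to (counts, total clones analyzed).
    """
    lanes = {}
    for name, (counts, total_clones) in libraries.items():
        lanes[name] = counts.get(entry, 0) / total_clones
    return lanes


libraries = {
    "HUVEC": ({"HSPOLYAB": 14, "HSFIB1": 47}, 5000),  # counts from Table 2
    "library_B": ({"HSPOLYAB": 3}, 1000),             # invented
    "library_C": ({"HSFIB1": 9}, 3000),               # invented
}
blot = electronic_northern("HSPOLYAB", libraries)

for name, fraction in blot.items():
    print(f"{name:10s} {fraction:.4f}")
```

Each key of the result plays the role of one lane of a laboratory
Northern blot, with a value of zero where the transcript was not
found among the clones sampled.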

6.11. PHASE I CLINICAL TRIALS
Based on the establishment of safety and effectiveness
in the above animal tests, Phase I clinical tests are
undertaken. Normal patients are subjected to the usual
preliminary clinical laboratory tests. In addition,
appropriate specimens are taken and subjected to gene
transcript analysis. Additional patient specimens are
taken at predetermined intervals during the test. The
specimens are subjected to gene transcript analysis as
described above. In addition, the gene transcript changes
noted in the earlier rat toxicity study are carefully
evaluated as clinical markers in the followed patients.
Changes in the gene transcript analyses are evaluated as
indicators of toxicity by correlation with clinical signs
and symptoms and other laboratory results. In addition,
subtraction is performed on individual patient specimens
and on averaged patient specimens. The subtraction
analysis highlights any toxicological changes in the
treated patients. This is a highly refined determinant of
toxicity. The subtraction method also annotates clinical
markers. Further subgroups can be analyzed by subtraction
analysis, including, for example, 1) segregation by
occurrence and type of adverse effect; and 2) segregation
by dosage.

6.12. GENE TRANSCRIPT IMAGING ANALYSIS IN CLINICAL STUDIES

A gene transcript imaging analysis (or multiple gene
transcript imaging analyses) is a useful tool in other
clinical studies. For example, the differences in gene
transcript imaging analyses before and after treatment can
be assessed for patients on placebo and drug treatment.
This method also effectively screens for clinical markers
to follow in clinical use of the drug.

The subtraction method can be used to screen cDNA
libraries from diverse sources. For example, the same cell
types from different species can be compared by gene
transcript analysis to screen for specific differences,
such as in detoxification enzyme systems. Such testing
aids in the selection and validation of an animal model for
the commercial purpose of drug screening or toxicological
testing of drugs intended for human or animal use. When
the comparison between animals of different species is
shown in columns for each species, we refer to this as an
interspecies comparison, or zoo blot.

Embodiments of this invention may employ databases
such as those written using the FoxBASE programming
language commercially available from Microsoft Corporation.
Other embodiments of the invention employ other databases,
such as a random peptide database, a polymer database, a
synthetic oligomer database, or an oligonucleotide database
of the type described in U.S. Patent 5,270,170, issued
December 14, 1993 to Cull, et al., PCT International
Application Publication No. WO 9322684, published November
11, 1993, PCT International Application Publication No. WO
9306121, published April 1, 1993, or PCT International
Application Publication No. WO 9119818, published December
26, 1991. These four references (whose text is
incorporated herein by reference) include teaching which
may be applied in implementing such other embodiments of
the present invention.

All references referred to in the preceding text are
hereby expressly incorporated by reference herein.

Various modifications and variations of the described
method and system of the invention will be apparent to
those skilled in the art without departing from the scope
and spirit of the invention. Although the invention has
been described in connection with specific preferred
embodiments, it should be understood that the invention as
claimed should not be unduly limited to such specific
embodiments.

TABLE 2

Clone numbers 15000 through 20000
Libraries: HUVEC
Arranged by ABUNDANCE
Total clones analyzed: 5000
319 genes, for a total of 1713 clones

      number        N   entry       descriptor
  1   15365        67   HSRPL41     Riboptn L41
  2   15004        65   NCY015004   INCYTE 015004
  3   15638        63   NCY015638   INCYTE 015638
  4   15390        50   NCY015390   INCYTE 015390
  5   15193        47   HSFIB1      Fibronectin
  6   15220        47   RRRPL9      Riboptn L9
  7   15280        47   NCY015280   INCYTE 015280
  8   15583        33   M62060      EST HHCH09 (IGR)
  9   15662        31   HSACTCGR    Actin, gamma
 10   15026        29   NCY015026   INCYTE 015026
 11   15279        24   HSEF1AR     Elf 1-alpha
 12   15027        23   NCY015027   INCYTE 015027
 13   15033        20   NCY015033   INCYTE 015033
 14   15198        20   NCY015198   INCYTE 015198
 15   15809        20   HSCOLL1     Collagenase
 16   15221        19   NCY015221   INCYTE 015221
 17   15263        19   NCY015263   INCYTE 015263
 18   15290        19   NCY015290   INCYTE 015290
 19   15350        18   NCY015350   INCYTE 015350
 20   15030        17   NCY015030   INCYTE 015030
 21   15234        17   NCY015234   INCYTE 015234
 22   15459        16   NCY015459   INCYTE 015459
 23   15353        15   NCY015353   INCYTE 015353
 24   [illegible]  15   S76965      Ptn kinase inhib
 25   [illegible]  14   HUMTHYB4    Thymosin beta-4
 26   [illegible]  14   HSLIPCR     Lipocortin I
 27   15425        14   HSPOLYAB    Poly-A bp
 28   18212        14   HUMTHYMA    Thymosin, alpha
 29   18216        14   HSMRP1      Motility relat ptn; MRP-1; CD-9
 30   15189        13   HS18D       Interferon induc ptn 1-8D
 31   15031        12   HUMFKBP     FK506 bp
 32   15306        12   HSK2AZ      Histone H2A
 33   15621        12   HUMLEC      Lectin, B-galbp, 14kDa
 34   15789        11   NCY015789   INCYTE 015789
 35   16578        11   HSRPS11     Riboptn S11
 36   16632        11   M61984      EST HHCA13 (IGR)
 37   18314        11   NCY018314   INCYTE 018314
 38   15367        10   NCY015367   INCYTE 015367
 39   15415        10   HSIFN1N1    Interferon induc mRNA
 40   15633        10   HSLDHAR     Lactate dehydrogenase
 41   15813        10   CHKNMHCB    C Myosin heavy chain B
 42   18210        10   NCY018210   INCYTE 018210
 43   18233        10   HSRPIH40    RNA polymerase II
 44   18996        10   NCY018996   INCYTE 018996
 45   15088         9   HUMFERL     Ferritin, light chain
 46   15714         9   NCY015714   INCYTE 015714
 47   15720         9   NCY015720   INCYTE 015720
 48   15863         9   NCY015863   INCYTE 015863
 49   16121         9   HSET        Endothelin
 50   18252         9   NCY018252   INCYTE 018252
 51   15351         8   HUMALBP     Lipid bp, adipocyte
 52   15370         8   NCY015370   INCYTE 015370

TABLE 2 Con't

      number        N   entry         descriptor
 53   15670         8   BTCIASHI      NADH-ubiq oxidoreductase
 54   15795         8   NCY015795     INCYTE 015795
 55   16245         8   NCY016245     INCYTE 016245
 56   18262         6   NCY018262     INCYTE 018262
 57   18321         6   HSRPL17       Riboptn L17
 58   15126         6   [illegible]   Riboptn L1
 59   15133         6   [illegible]   Actin, beta
 60   15245         6   NCY015245     INCYTE 015245
 61   15288         6   NCY015288     INCYTE 015288
 62   [illegible]   6   HSGAPDR       G-3-PD
 63   [illegible]   6   [illegible]   Laminin receptor, 54kDa
 64   [illegible]   6   HSNGMRNA      Uracil DNA glycosylase
 65   16646         6   NCY016646     INCYTE 016646
 66   [illegible]   6   HUMPAIA       Plsmnogen activ gene
 67   [illegible]   6   HUMUB         Ubiquitin
 68   [illegible]   6   HSRPS8        Riboptn S8
 69   15295         6   NCY015295     INCYTE 015295
 70   15458         6   RNRPS10R      Riboptn S10
 71   15832         6   RSGALEM       UDP-galactose epimerase
 72   15928         6   HUMAPOJ       Apolipoptn J
 73   16598         6   HUMTBBM40     Tubulin, beta
 74   18218         6   NCY018218     INCYTE 018218
 75   18499         6   HSP27         Hydrophobic ptn p27
 76   18963         6   NCY018963     INCYTE 018963
 77   18997         6   NCY018997     INCYTE 018997
 78   15432         5   HSAGALAR      Galactosidase A, alpha
 79   15475         5   NCY015475     INCYTE 015475
 80   15721         5   NCY015721     INCYTE 015721
 81   15865         5   NCY015865     INCYTE 015865
 82   16270         5   NCY016270     INCYTE 016270
 83   16886         5   NCY016886     INCYTE 016886
 84   18500         5   NCY018500     INCYTE 018500
 85   18503         5   NCY018503     INCYTE 018503
 86   19672         5   RRRPL34       Riboptn L34
 87   15086         4   XLRPL1AR      Riboptn L1a
 88   15113         4   HUMIFNWRS     tRNA synthetase, trp
 89   15242         4   NCY015242     INCYTE 015242
 90   15249         4   NCY015249     INCYTE 015249
 91   15377         4   NCY015377     INCYTE 015377
 92   15407         4   NCY015407     INCYTE 015407
 93   15473         4   NCY015473     INCYTE 015473
 94   15588         4   HSRPS12       Riboptn S12
 95   15684         4   HSEF1G        Elf 1-gamma
 96   15782         4   NCY015782     INCYTE 015782
 97   15916         4   HSRPS18       Riboptn S18
 98   15930         4   NCY015930     INCYTE 015930
 99   16108         4   NCY016108     INCYTE 016108
100   16133         4   NCY016133     INCYTE 016133

TABLE 4

Libraries: THP-1
Subtracting: HMC
Sorted by ABUNDANCE
Total clones analyzed: 7375
1057 genes, for a total of 2151 clones

number   entry       descriptor                    bgfreq  rfend  ratio
10022    HUMIL1      IL 1-beta                          0    131  262.00
10036    HSMDNCF     IL-8                               0    119  238.00
10069    HSLAG1CDN   Lymphocyte activ gene              0     71  142.00
10060    HUMTCSM     RANTES                             0     23  46.000
10003    HUMMIP1A    MIP-1                              3    121  40.333
10689    HSOP        Osteopontin                        0     20  40.000
11050    NCY011050   INCYTE 011050                      0     17  34.000
10937    HSTNFR      TNF-alpha                          0     17  34.000
10176    HSSOD       Superoxide dismutase               0     14  28.000
10886    HSCDW40     B-cell activ, NGF-relat            0     10  20.000
10186    HUMAPR      Early resp PMA-induc               0      9  18.000
10967    HUMCDN      PN-1, glial-deriv                  0      9  18.000
11353    NCY011353   INCYTE 011353                      0      8  16.000
10298    NCY010298   INCYTE 010298                      0      7  14.000
10215    HUM4COLA    Collagenase, type IV               0      6  12.000
10276    NCY010276   INCYTE 010276                      0      6  12.000
10488    NCY010488   INCYTE 010488                      0      6  12.000
11138    NCY011138   INCYTE 011138                      0      6  12.000
10037    HUMCAPPRO   Adenylate cyclase                  1     10  10.000
10840    HUMADCY     Adenylate cyclase                  0      5  10.000
10672    HSCD44E     Cell adhesion glptn                0      5  10.000
12837    HUMCYCLOX   Cyclooxygenase-2                   0      5  10.000
10001    NCY010001   INCYTE 010001                      0      5  10.000
10005    NCY010005   INCYTE 010005                      0      5  10.000
10294    NCY010294   INCYTE 010294                      0      5  10.000
10297    NCY010297   INCYTE 010297                      0      5  10.000
10403    NCY010403   INCYTE 010403                      0      5  10.000
10699    NCY010699   INCYTE 010699                      0      5  10.000
10966    NCY010966   INCYTE 010966                      0      5  10.000
12092    NCY012092   INCYTE 012092                      0      5  10.000
12549    HSRHOB      Oncogene rho                       0      5  10.000
10691    HUMARF1BA   ADP-ribosylation fctr              0      4  8.000
12106    HSADSS      Adenylosuccinate synthetase        0      4  8.000
10194    HSCATHL     Cathepsin L                        0      4  8.000
10479    CLMCYCA     Cyclin A                           0      4  8.000
10031    NCY010031   INCYTE 010031                      0      4  8.000
10203    NCY010203   INCYTE 010203                      0      4  8.000
10288    NCY010288   INCYTE 010288                      0      4  8.000
10372    NCY010372   INCYTE 010372                      0      4  8.000
10471    NCY010471   INCYTE 010471                      0      4  8.000
10484    NCY010484   INCYTE 010484                      0      4  8.000
10859    NCY010859   INCYTE 010859                      0      4  8.000
10890    NCY010890   INCYTE 010890                      0      4  8.000
11511    NCY011511   INCYTE 011511                      0      4  8.000
11868    NCY011868   INCYTE 011868                      0      4  8.000
12820    NCY012820   INCYTE 012820                      0      4  8.000
10133    HSI1RAP     IL-1 antagonist                    0      4  8.000
10516    HUMP2A      Phosphatase, regul 2A              0      4  8.000
11063    HUMB94      TNF-induc response                 0      4  8.000
11140    HSHB15RNA   HB15 gene; new Ig                  0      3  6.000
10788    NCY001713   INCYTE 001713                      0      3  6.000
10033    NCY010033   INCYTE 010033                      0      3  6.000
10035    NCY010035   INCYTE 010035                      0      3  6.000
10084    NCY010084   INCYTE 010084                      0      3  6.000
10236    NCY010236   INCYTE 010236                      0      3  6.000
10383    NCY010383   INCYTE 010383                      0      3  6.000
TABLE 4 (Cont'd)

number  entry      s  descriptor     bgfreq  rfend  ratio

10450   NCY010450     INCYTE 010450       0      3  6.000
10470   NCY010470     INCYTE 010470       0      3  6.000
10504   NCY010504     INCYTE 010504       0      3  6.000
10507   NCY010507     INCYTE 010507       0      3  6.000
10598   NCY010598     INCYTE 010598       0      3  6.000
10779   NCY010779     INCYTE 010779       0      3  6.000
10909   NCY010909     INCYTE 010909       0      3  6.000
10976   NCY010976     INCYTE 010976       0      3  6.000
10985   NCY010985     INCYTE 010985       0      3  6.000
11052   NCY011052     INCYTE 011052       0      3  6.000
11068   NCY011068     INCYTE 011068       0      3  6.000
11134   NCY011134     INCYTE 011134       0      3  6.000
11136   NCY011136     INCYTE 011136       0      3  6.000
11191   NCY011191     INCYTE 011191       0      3  6.000
11219   NCY011219     INCYTE 011219       0      3  6.000
11386   NCY011386     INCYTE 011386       0      3  6.000
11403   NCY011403     INCYTE 011403       0      3  6.000
11460   NCY011460     INCYTE 011460       0      3  6.000
11618   NCY011618     INCYTE 011618       0      3  6.000
11686   NCY011686     INCYTE 011686       0      3  6.000
12021   NCY012021     INCYTE 012021       0      3  6.000
12025   NCY012025     INCYTE 012025       0      3  6.000
12320   NCY012320     INCYTE 012320       0      3  6.000
12330   NCY012330     INCYTE 012330       0      3  6.000
12853   NCY012853     INCYTE 012853       0      3  6.000
14386   NCY014386     INCYTE 014386       0      3  6.000
14391   NCY014391     INCYTE 014391       0      3  6.000
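
The ratio column of Table 4 appears to be rfend divided by bgfreq, with one half used in place of a bgfreq of zero (the fusion routine in the Table 5 listing stores a fractional stand-in when a gene is not found in the subtracted library). The following Python sketch reproduces several tabulated ratios under that reading. The function name `abundance_ratio` is illustrative only and does not appear in the original program.

```python
def abundance_ratio(bgfreq, rfend):
    """Ratio of the count in the analyzed library (rfend) to the
    count in the subtracted library (bgfreq).

    Assumption: a bgfreq of 0 is replaced by 0.5 so the ratio stays finite.
    """
    actual = bgfreq if bgfreq > 0 else 0.5
    return rfend / actual


# A few (bgfreq, rfend) pairs taken from Table 4.
for bgfreq, rfend in [(0, 131), (3, 121), (1, 10), (0, 3)]:
    print(f"{bgfreq:6d} {rfend:6d} {abundance_ratio(bgfreq, rfend):9.3f}")
```

Under this reading a gene seen three times in the analyzed library and never in the subtracted library scores 6.000, which matches the final rows of the table.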
TABLE 5

* Master menu for SUBTRACTION output
SET TALK OFF
SET SAFETY OFF
SET EXACT OFF
SET TYPEAHEAD TO 0
CLEAR
SET DEVICE TO SCREEN
USE "SmartGuy:FoxBASE+/Mac:fox files:Clones.dbf"
GO TOP
STORE NUMBER TO INITIATE
GO BOTTOM
STORE NUMBER TO TERMINATE
STORE ' ' TO Target1
STORE ' ' TO Target2
STORE ' ' TO Target3
STORE ' ' TO Object1
STORE ' ' TO Object2
STORE ' ' TO Object3
STORE 0 TO ANAL
STORE 0 TO EMATCH
STORE 0 TO HMATCH
STORE 0 TO OMATCH
STORE 0 TO IMATCH
STORE 0 TO PTF
STORE 1 TO BAIL
DO WHILE .T.

* Program.: Subtraction 2.fmt
* Version.: FoxBASE+/Mac, revision 1.10
* Notes....: Format file Subtraction 2



SCKEDf 1 TfFZ C HD£32>C 'Screen l f AT 40.2 fiTlE 3fifi 4 03 tjtvtt c ~« -~ 

fl TTXZLS 75,120 TO 178,241 STtti 3671 COL»To 3 aittO^lM*?^ >G «^ , '5 COLOR 0,0,0, 

6 rXXELS 27,«4 SAY -Subtraction M«m- fi^ ^ -^213 « * , , 

G FIXn-S. 117,126 GET DiATCK STrl/l* 536 HOT '&iL^io ^^^S.?^ °' 0 '-lf*.^.-l 

c-fxxels assise get ItvS lisll ESS -SiS£-'*2 SIS * «ze isiea co 

« IIXELS 153,126 GET GMXTCH rm£ £5536 FWT mSS JJf^ 0 ^ 0 ^ 011 * SIZE J£,l 

€ ri5CE2*B 171,126 GET Imatefc STOLE £5536 FONT 'Chir-oot n A rTri-Se ZL ' *' 1 

;.gs sre s.3 ssi 2i?j& «... «, 

( T»EL6 108,50 GET tor9«tl STOLE 0 TCMT ■ Geneve- 9 eii* v> ?o Snim « i. . 1 . 
■C ISELS 135.30 SET tax|«tS STOLE 0 KNT '» £I2E o*?! £H£ S'S' 'J« 

« FIXELS 163.30 GET tJr|et3 STOLE 0 POTT >oS£M "|f J|'2! S'fi'"}' ' 

« TDEL6 135. 3S9 GET cbjeet3 STOLE 0 TOTT •Geneva' S SI2E 12 7B rot™ n E' , T «' i 
6 TDE£ 162.3SS GET ob ect3 STOLE o" -^e-J 11% a'll S ' S' "l ' "i ' 



* EOF: Subtraction 2.fmt
READ
IF Bail=2
CLEAR
CLOSE DATABASES
USE "SmartGuy:FoxBASE+/Mac:fox files:clones.dbf"
SET SAFETY ON
SCREEN 1 OFF
RETURN
ENDIF

STORE VAL(SYS(2)) TO STARTIME
STORE UPPER(Target1) TO Target1
STORE UPPER(Target2) TO Target2
STORE UPPER(Target3) TO Target3
STORE UPPER(Object1) TO Object1
STORE UPPER(Object2) TO Object2
STORE UPPER(Object3) TO Object3
CLEAR
SET TALK ON
GAP = TERMINATE-INITIATE+1
GO INITIATE

COPT KDTT GAP rrrr.pc wriwapp -l iv..^,, * . fc _ 

COUNT TO TOT w 

£p^ re^Ki *" tch '° .AND. a»oM 

USE ' 

COPY STRUCTURE TO TEMPDESIG
USE TEMPDESIG
IF Ematch=1
APPEND FROM TEMPNUM FOR D='E'
ENDIF
IF Hmatch=1
APPEND FROM TEMPNUM FOR D='H'
ENDIF
IF Omatch=1
APPEND FROM TEMPNUM FOR D='O'
ENDIF
IF Imatch=1
ENDIF
COUNT TO STARTOT
COPY STRUCTURE TO TEMPLIB
•USE TOCliIB . . 

BjDjp • ^ R l-araxy«CF5BR (targets) 

IF t&rgct3<>* i 

fWM.TO'PBES* FOR Itorytm,, (tarpet3) 

COUNT TO ANALTOT
USE TEMPDESIG
COPY STRUCTURE TO TEMPSUB
USE TEMPSUB
APPEND FROM TEMPDESIG FOR library=UPPER(object1)
APPEND FROM TEMPDESIG FOR library=UPPER(object2)
IF object3<>' '
APPEND FROM TEMPDESIG FOR library=UPPER(object3)
ENDIF
COUNT TO SUBTRACTOT
SET TALK OFF

* COMPRESSION SUBROUTINE A
? 'COMPRESSING QUERY LIBRARY'
USE TEMPLIB
SORT ON ENTRY,NUMBER TO LIBSORT
USE LIBSORT
COUNT TO IDGENE
REPLACE ALL RFEND WITH 1
MARK1 = 1
SW2 = 0
DO WHILE SW2=0 ROLL
IF MARK1 >= IDGENE
PACK
COUNT TO AUNIQUE
SW2 = 1
LOOP
ENDIF
GO MARK1
DUP = 1
STORE ENTRY TO TESTA
STORE D TO DESIGA
SW = 0
DO WHILE SW=0 TEST
SKIP
STORE ENTRY TO TESTB
STORE D TO DESIGB
IF TESTA = TESTB .AND. DESIGA = DESIGB
DELETE
DUP = DUP+1
LOOP
ENDIF
GO MARK1
REPLACE RFEND WITH DUP
MARK1 = MARK1+DUP
SW = 1
LOOP
ENDDO TEST
LOOP
ENDDO ROLL
SORT ON RFEND/D,NUMBER TO TEMPTARSORT
USE TEMPTARSORT

IJS*S J^ir^SSJ *™ WEND/2DCENEUOOO0 



* COMPRESSION SUBROUTINE B
? 'COMPRESSING TARGET LIBRARY'
USE TEMPSUB
SORT ON ENTRY,NUMBER TO SUBSORT
USE SUBSORT
COUNT TO SUBGENE
REPLACE ALL RFEND WITH 1
MARK1 = 1
SW2 = 0
DO WHILE SW2=0 ROLL
IF MARK1 >= SUBGENE
PACK
COUNT TO BUNIQUE
SW2 = 1
LOOP
ENDIF
GO MARK1
DUP = 1
STORE ENTRY TO TESTA
STORE D TO DESIGA
SW = 0
DO WHILE SW=0 TEST
SKIP
STORE ENTRY TO TESTB
STORE D TO DESIGB
IF TESTA = TESTB .AND. DESIGA = DESIGB






WO 95/20681 

PCT/US95/01160 

DELETE
DUP = DUP+1
LOOP
ENDIF
GO MARK1
REPLACE RFEND WITH DUP
MARK1 = MARK1+DUP
SW = 1
LOOP
ENDDO TEST
LOOP
ENDDO ROLL
SORT ON RFEND/D,NUMBER TO TEMPSUBSORT
*USE TEMPSUBSORT

oocwt ro tojpsueco 

* FUSION ROUTINE
? 'SUBTRACTING LIBRARIES'
USE SUBTRACTION
COPY STRUCTURE TO CRUNCHER
SELECT 2
USE TEMPSUBSORT
SELECT 1
USE CRUNCHER
APPEND FROM TEMPTARSORT
COUNT TO BAILOUT
MARK = 0
DO WHILE .T.
SELECT 1
MARK = MARK+1
IF MARK>BAILOUT
EXIT
ENDIF
GO MARK
STORE ENTRY TO SCANNER
SELECT 2
LOCATE FOR ENTRY=SCANNER
IF FOUND()
STORE RFEND TO BIT1
STORE RFEND TO BIT2
ELSE
STORE 1/2 TO BIT1
STORE 0 TO BIT2
ENDIF
SELECT 1
REPLACE BGFREQ WITH BIT2
REPLACE ACTUAL WITH BIT1
LOOP
ENDDO
SELECT 1
REPLACE ALL RATIO WITH RFEND/ACTUAL
? 'DOING FINAL SORT BY RATIO'
SORT ON RATIO/D,BGFREQ/D,DESCRIPTOR TO FINAL



DO CASE
CASE PTF=0
SET DEVICE TO PRINT
SET PRINT ON
EJECT
CASE PTF=1
SET ALTERNATE TO 'Adenoid patent figure:Subtraction.txt'
SET ALTERNATE ON
ENDCASE
STORE VAL(SYS(2)) TO FINTIME
STORE FINTIME - STARTIME TO COMPSEC
STORE COMPSEC/60 TO COMPMIN

*SET MARGIN TO 10

•1,1 EAY •lAraiy Subtraction Analysis mix 6 £ £36 ^ .Geneva'*,:^ COLOR 0,0,0,-1, -1,-1 



?
?
? DATE()
?? ' '
?? TIME()
? 'Clone numbers '
?? STR(INITIATE,5,0)
?? ' through '
?? STR(TERMINATE,6,0)
? Target1
IF Target2<>' '
?? ', '
?? Target2
ENDIF
IF Target3<>' '
?? ', '
?? Target3
ENDIF
? 'Subtracting: '
? Object1
IF Object2<>' '
?? ', '
?? Object2
ENDIF
IF Object3<>' '
?? ', '
?? Object3
ENDIF
? 'Designations: '
IF Ematch=0 .AND. Hmatch=0 .AND. Omatch=0 .AND. IMATCH=0
?? 'All'
ENDIF
IF Ematch=1
?? 'Exact, '
ENDIF
IF Hmatch=1
?? 'Human, '
ENDIF
IF Omatch=1
?? 'Other sp. '
ENDIF
IF Imatch=1
?? 'INCYTE'
ENDIF
IF ANAL=1
? 'Sorted by ABUNDANCE'
ENDIF
IF ANAL=3
? 'Arranged by FUNCTION'
ENDIF
? 'Total clones represented: '
?? STR(TOT,5,0)
? 'Total clones analyzed: '
?? STR(STARTOT,5,0)
? 'Total computation time: '
?? ' minutes'

!;ilr.™^ r. function . . ^ it , , . ^ 

ggi 1 .? 1 ° ? ««« AT 40.2 tan « Mfia FML s W •CceveVS COLOR 0 0 0 

CASE >TM*1 
?? STIv UUNlOUE f 4,0) 
■77 » ^cnes, for « tct*l cf ' • 
. 77 £TB<AN&WOT,X,0) ■ 

? . _ ' 

SCREEN 1 1YP2 0 KEXEUC 'Screen !• AT 40 2 «>T5r it* ttvrre . 
CLOSE DWAEAffiS 

.USE/fi^C^^cicSAErt/Maciica m« t clones. «bf 

CASE.7>JW*3 
• • airan^e/f unction 
SET yjtfWCN 
STT KE^D^flG CN 

. p» a „«.. hww, . BCTeen avw .x B m.«2 FOTL5 ^ colok c 

'J ' BINDING nOTETNS* 

7 .5ur**« »ci.ccl« and r^^cr.,' ' 4 " POTLS ?COT COLOR 0 

SCREEN 1 SYFE 0 'Screen 1" AT 40 2*rr?r 5ce /o cwte • 

ECKEBC 1 TOJE 0 *2AXIIWS 'Screen 1 ■ XT 40 2 «T?r -e* *e«5 -e-^t r . 

ii.t err ^^r.z.K.^s^c^^ o.o.o. 

BCRI33* 3 -0 K£\DING 'Screen 1* AT'40 5 r -rc /« W te . 

SCREDJ a O. HEADING 'Scrvm 1« AT .40 2 ffirr 2RC zc? t>tvrr e mm. 

ii.t oft — ^ . 0 , r , 2 .k. ^KvTs^e^^f li^r^^ . ;°^r^;":s . °- 0 ' 0 ' 

H.t OTT fields -^.D.F.I.iUIW.Si^^fiSS,^,^ °'°' 0 ' 
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FfifA SiMf?** ' -er ~ V * T 40 ' 2 «■ 3 ".<S2 »■» » -H.ta.ti a* ,265 COLOR 0 

PSLIJ^^LST * <0<! . EIZE J£€;4S: *™ «« •H«lv l ti'c.., 2 „ CO** O 

SCREO* 1 TiTL 0 HEADING •Screen 3* AT 40 2 srsr as« zc- r^vrr^ 

. aisc crr.fi^ ^. D .r.t.x,tS B ?,!!i^ , J^™ c.o.o, 
. T%!^V&J35Z2r~ 3 ' ^ 4C ' 3 £12E 2Sf< " m ■H.W-i.an color o 

K3U2N 1 TOfC KEfcDJKG .'Screen 1' AT 40. J Eirr jc* 453 cunt ry*-. , r 
ECKI3M J WE 0 KEfcDUJS "Screen 1» AT in 5 eirr s«< /c- rtrr, r . . 

2 ' 

SCREEN 3 0 HEADING •Ecreen 2 ' AT 40 2 rr?? st* ye- * T _ ' . 

? .^Ecxiption and Nucleic EiLbKdii P^in, ! ^ H,3v<Mtt, ' a « «£» 0 

SCREEN 1 2YFE 0 HEADING 'Scrtcr* 2' AT 40 5 « *e<5 n We ' . 

SC7J22J 2 0 HEADING •Jcrec, 2' AT 40 2 ct?t ice -r^, 

gCRCDf 1 WE 0 KLAEXN3 'Screen 2 f AT'4fi 2 C7?t 5t« ^ei ~. 

? 'Riboscwa prot.iasi* . . * T £IZE 2E «-<" PIXELS POTT *He2v»tict» ,565 CCttjok 0 

ECREnfl 1 0 HEADUC 'Screen 2» AT 40 2 Strr 3p* /c- mn P 

. rSJ'Si.iSS? • Sc " en »' AT <0 ' 2 rp^ mm- .ao^i^e color o 

SCREEN 3 0 l-SABEre »Scxeen 1* AT 40,2 £I2£ 2RF *C2 trwc ^ • 

U,t OFF Mid. »^.».M.K.I^ ? X2cJ^^ 

? * DTZYKES' 

SCREEN 1 TOT 0 HEADING 'Screen 1 • AT 4 D 3 cm so* r^v^ r, 

?■ •Jerrcpreteiirti • AT 40,2 SIZI -?6.4S2 FIXELS FONT 'Helvetic* • ,2C5 COLOR 0 

fCRED^ 2 *^7E 0 HEADING 'Screen 1» at 7 cT-r -joe . 

f^^anr^St^- 31 '^ »•* ««* ItW -He^tice-^fis COLOR 0 

fCRCN 1 TTPE 0 HE1ADD3C 'Screen !• AT dD 3 r 77 r ^ec j, C n t ^r . 

S2f« aw »^to8i^'ia^ .;..o. 

SO^ED? 1 TVTE 0 HEADING" 'Ccrecn 2' AT 40 5 <T-r ^o* it* r*v« ^ „ [ 

rVsacar jnctebcliKn: ■ • 1 AT *°' 2 £I<X 2B6 ' 4S < TCW •Ktlv.tlca- ,265 COLOR 0 
SCREX3* 2 TX7E 0 HEADING "Screen 2* AT 40 2 e? 7 r •jee jC n ~ 

a« off field. ^.i.T.zTtnms^^l*^ c.o.o. 

ECU* ! TYPE 0 HEADING 'Ecreen 1- AT 40.2 S JS 2E6.4S2 FIXERS FONT .C^ev,.., c^OK O'.O.O,' 
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U«t GTT fields nnber # D,F, Z.R.BflHy, S, DESCRIPTOR, fcGTREQ, RTDJD,RAT30rI FOR 

CCKHJO DTttDXHS "Screen AT 40.3 SIZE 2S6.IS2 FIXELS KNT -GeneW.7 COLOR 0,0,0* 

list, OFT fields na^ber ,D, Fi £,R,EN7Ry, S, DESCRIPTOR, KFKEQ,RTD*D, RATIO, I rOR R.'N'^^ 

r?3i3 StLSijSf?* ,£creen a> ^ <D#S £m 3 "' 4M ^ • Kelv * tic * l ^« color o 

ECREStf i T*FE 0 HEADING •Screen 1- AT 40,2 SIZE 26MS2 PIXELS FCNT -Cenra- 7 COLOR 000 

list off wr^r.D.r.t.R.enwr.fi.nsaaTO roxL-w' ' ' 

f C Si ^oJgf^ 2 ^ >Ecrra J ' * T 40 ' 2 * H ' m . roa ^ ra7r •»«lv.tiw\aeB color o 

TE?*! rogW 1' AT 40,3 SIZE SU^S2 FIXELS Km -Geneva \ 1 COLOR 0 0 0 

list OIT iieldf ws^r.D.r.X.^BOT.S.ra FOR )U'E' ' '° # 

SCREEN 1 «PE 0 EMIN3 /Screen 1> AT 40.3 SIZE 586,452 PIXELS 7WT -Helvetica- , 366 COLOR 0 

? 1 MISCELLANEOUS CATTDGORIES ' 

? 

,£CrW1 J ; ^ na: a "'"= W»" 'f«r -B.lv.tic .265 CCLOR 0 

FCKEEN-l TYPE 0 KEAD2N3 -Screen 2* AT 40,3 SIZE 26£.4S2 PIXELS FWT -Geneve* 7 color 'ft n r 
lift OFF faeldt ^er,D,F;Z,^TO^ °'°' 0 ' 

f^JcSS.- •' £CTe€n *' ™ 40 ' 2 EUE 2H,<S2 ' F1XtLS , H«Jv.tiet- i «5 COOR'O 

SCREEN 3 WE 0 ROWW 'Scrien !• AT 40,3 SIZE 366.4S2 PIXELS ' PWT -Geneve',? COLOR 0 0 0 
list OFF »u*btr.D,F i a,R.^ 

y^ei ^« 0 i^ 2M3 ;6e?t ^ ^ * AT " 40:S £1ZE 3 "' <52 JOaa ? ^ , »«lv.tica-.3H COLOR -0 

sam? 1 H^f^ 0 ^f 1 ^ •Screen 1- AT 40,2 SIZE 2fi£.4S2 PIXELS ' TW-Ceneva- 7 COLOR boo 
list OFF a^*.O.F,i.R.INTO.S.taCMI^ °' 0 - 0 * 

J^L^uL^^;^ l ' AT ' 40 ' 2 -Helvetic.-, 3S5 COLOR 0 

f^f^4 lillJ JSXEJVS^Jf' AT E1It 2 "'" 2 FIJttLS TOTT -Geneva', 7 COLOR *0 f 0j 0, 

last OFF field* aw^r.D.y.z.R.iww.s.rasciu tor r.«tj» v ' u ' u » 

ENDCASE
DO "Test print.prg"
SET PRINT OFF
SET DEVICE TO SCREEN
CLOSE DATABASES
ERASE TEMPLIB.DBF
ERASE TEMPNUM.DBF
ERASE TEMPDESIG.DBF
SET MARGIN TO 0
CLEAR
LOOP
ENDDO
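
The compression subroutines and the fusion routine in the listing above can be summarized in a few lines of a modern language. The following Python sketch is an illustration only, not the original program: it counts clones per gene in the analyzed library, looks each gene up in the subtracted library, and ranks by ratio. The record keys (`entry`, `d`), the function names, the summing of background counts per entry, and the use of `entry` as the final tie-breaker (the listing sorts on the descriptor) are all assumptions made for the example.

```python
from collections import Counter


def compress(clones):
    """Count clones per (entry, designation); the count plays the role of RFEND."""
    return Counter((c["entry"], c["d"]) for c in clones)


def subtract(target_clones, background_clones):
    """Return one row per gene of the analyzed library, highest ratio first."""
    target = compress(target_clones)
    background = Counter(c["entry"] for c in background_clones)
    rows = []
    for (entry, d), rfend in target.items():
        bgfreq = background.get(entry, 0)
        # Assumed stand-in of 0.5 when the gene is absent from the background.
        actual = bgfreq if bgfreq > 0 else 0.5
        rows.append({"entry": entry, "d": d, "bgfreq": bgfreq,
                     "rfend": rfend, "ratio": rfend / actual})
    # Sort by ratio (descending), then bgfreq (descending), then entry.
    rows.sort(key=lambda r: (-r["ratio"], -r["bgfreq"], r["entry"]))
    return rows


# Hypothetical input: gene A seen three times, gene B twice, B once in background.
target = [{"entry": "A", "d": "E"}] * 3 + [{"entry": "B", "d": "H"}] * 2
background = [{"entry": "B", "d": "H"}]
for row in subtract(target, background):
    print(row)
```

A counting dictionary replaces the sort, skip and delete loop of the listing; the ranking it produces should be the same whenever each gene carries a single designation.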
* Northern (single), version
CLOSE DATABASES
SET TALK OFF
SET PRINT OFF
SET EXACT OFF
CLEAR
STORE ' ' TO Eobject
STORE ' ' TO Dobject
STORE 0 TO Numb
STORE 0 TO Zog
STORE 1 TO Bail
DO WHILE .T.

* Program.: Northern (single).fmt
* Date....: 8/ 8/94
* Version.: FoxBASE+/Mac, revision 1.10
* Notes....: Format file Northern (single)

e iixsls ais.se say .^try #?^e«S essaf ^i£'?r a56O0 '- l '-» 

f lis awjoisayH 1 i^s'i&s's&'s : : : :: 

W«)»ni liiaglu.M O-W.U COM .J- 

READ
IF Bail=2
CLEAR
SCREEN 1 OFF
RETURN
ENDIF
USE "SmartGuy:FoxBASE+/Mac:Fox files:clones.dbf"
IF Eobject<>' '
STORE UPPER(Eobject) TO Eobject
SET SAFETY OFF
SORT ON Entry TO "Lookup entry.dbf"
SET SAFETY ON
USE "Lookup entry.dbf"
LOCATE FOR Look=Eobject
IF .NOT.FOUND()
CLEAR
LOOP
ENDIF
STORE Entry TO Searchval
CLOSE DATABASES
ERASE "Lookup entry.dbf"
ENDIF
IF Dobject<>' '
SET EXACT OFF
SET SAFETY OFF
SORT ON descriptor TO "Lookup descriptor.dbf"
USE "Lookup descriptor.dbf"
LOCATE FOR UPPER(TRIM(descriptor))=UPPER(TRIM(Dobject))
IF .NOT.FOUND()
CLEAR
LOOP
ENDIF
BROWSE
STORE Entry TO Searchval
CLOSE DATABASES
ERASE "Lookup descriptor.dbf"
SET EXACT ON
ENDIF
IF Numb<>0
USE "SmartGuy:FoxBASE+/Mac:Fox files:clones.dbf"
GO Numb
BROWSE
STORE Entry TO Searchval
ENDIF
CLEAR
? 'Northern analysis for entry '
?? Searchval
?
? 'Enter Y to proceed'
WAIT TO OK
CLEAR
IF UPPER(OK)<>'Y'
SCREEN 1 OFF
ENDIF

* COMPRESSION SUBROUTINE FOR Library.dbf
? 'Compressing the Libraries file now...'
USE "SmartGuy:FoxBASE+/Mac:Fox files:libraries.dbf"
SET SAFETY OFF
SORT ON library TO "Compressed libraries.dbf"
* FOR entered>0
SET SAFETY ON
USE "Compressed libraries.dbf"
DELETE FOR entered=0
PACK
COUNT TO TOT
MARK1 = 1
SW2 = 0
DO WHILE SW2=0 ROLL
IF MARK1 >= TOT
PACK
SW2 = 1
LOOP
ENDIF
GO MARK1
STORE library TO TESTA
SKIP
STORE library TO TESTB
IF TESTA = TESTB
DELETE
ENDIF
MARK1 = MARK1+1
LOOP
ENDDO ROLL

* Northern analysis
CLEAR
? 'Doing the northern now...'
SET TALK ON
USE "SmartGuy:FoxBASE+/Mac:Fox files:clones.dbf"
SET SAFETY OFF
COPY TO "Hits.dbf" FOR entry=searchval
SET SAFETY ON
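
The portion of the Northern listing shown above selects every clone whose entry matches the search value and reduces the library file to one row per library. How the hits are then summarized is not visible in this excerpt. The Python sketch below assumes a simple per-library tabulation of hits against total clones; the function name, the record keys and the library names `LIB_A` and `LIB_B` are hypothetical.

```python
from collections import Counter


def electronic_northern(clones, search_entry):
    """For each library, report (library, matching clones, total clones).

    Assumption: per-library counting is how the hits are summarized;
    the original listing is not shown past the hit-extraction step.
    """
    totals = Counter(c["library"] for c in clones)
    hits = Counter(c["library"] for c in clones if c["entry"] == search_entry)
    return [(lib, hits.get(lib, 0), totals[lib]) for lib in sorted(totals)]


# Hypothetical clone records.
clones = [
    {"library": "LIB_A", "entry": "X1"},
    {"library": "LIB_A", "entry": "X2"},
    {"library": "LIB_B", "entry": "X1"},
    {"library": "LIB_B", "entry": "X1"},
]
for library, n_hits, n_total in electronic_northern(clones, "X1"):
    print(f"{library}: {n_hits} of {n_total} clones")
```

Reporting the hit count next to the library size lets transcript abundance be compared between libraries of different depth.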
* MASTER ANALYSIS 3; VERSION 12-9-94
* Master menu for analysis output
CLOSE DATABASES
SET TALK OFF
SET SAFETY OFF
CLEAR
SET DEVICE TO SCREEN

SET DEFAULT TO "SmartGuy:FoxBASE+/Mac:fox files:Output programs:"
USE "SmartGuy:FoxBASE+/Mac:fox files:Clones.dbf"

GO TOP
STORE NUMBER TO INITIATE
GO BOTTOM
STORE NUMBER TO TERMINATE
STORE 0 TO ENTIRE
STORE 0 TO CONDEN
STORE 0 TO ANAL
STORE 0 TO EMATCH
STORE 0 TO HMATCH
STORE 0 TO OMATCH
STORE 0 TO IMATCH
STORE 0 TO XMATCH
STORE 0 TO PRINTON
STORE 0 TO PTF
DO WHILE .T.

* Program.: Master analysis.fmt
* Date....: 12/ 9/94
* Version.: FoxBASE+/Mac, revision 1.10
* Notes....: Format file Master analysis



SCRSN 1 TYPE 0 HEADING "Screen !• AT 40.2 £12- 2fif trir-re cwm^ .~ - . 

6 PIXELS 39,255 TO 277,430 STYLE 26«7 to£or o7o?f f;Slso?^-X * ' 9 C ° U * °'°' 0 ' 

6 PIXELS 75,120 TO 178.241 STM 3871 COLOR 0. 6. ^i, -25600. ll.il 

<? PIXELS 27,96 SM 'Customiaed Output Menu" STYLE £553£ TCtTT 'Geneva' 574 rmnr, ft a , , , 
0 PIXELS 45,54 GET conden STYLE £5536 FCtn 'Chicaoc" 12 raura^SiE rlli COLOR 0,0.-1,-1,-1 
6 PIXELS 54,261 GET anal STYLE £5536 TO -cKS^.ia l&SS^.!|5 sSf^L5°?^/ S P 
C FIXELS 117,126 GET DUTCH STYLE 65536 FOOT^eSLi" 12 kSw -S^^S^m^ 1 ??^ 
6 FIXELS 135,126 GET HMW0H STYLE €5536 FOOT 'Chicago- 13 FICXTC "S-S SSSmJSPctI;*?. 6 ? 
6 PIXELS 153,126 GET ORTCH STYLE 65536 FONT "Chicane' 12 FIC?\RE •}•? £h£ £2? eS?,^:} 
« PIXELS 90,152 SAY 'Matches:' STYLE 65536 rCNT^otneva'^ee dOLOR 0 0 ^ f ? . SIZE 15,84 
6 PIXELS 63 , 54 GET FRINTON STYLE £5536 FONT 'CMcecS' 12 WCTME-mA t~{ , ■ , 

0 FIXELS 171,126 GET Inatch STYLE 65536 FONT ^hica?c' ]l2 S 'sV?™.^™ 
6 FIXELS 252,146 GET initiate STYLE 0 FONT 'Geneve "?12 MZT 13/70 COLOR ST I TVV 5 00 
6 PIXELS 270, 146 GET terminate STYLE 0 FONT "Geneva* , 12 Slffi 15 70 COLOR 0 D l i i i 
0 FEELS 234,134 SAY -^1^ clones . £TVLE 65S36 ^ -G^evl',12 ^R 0 0 < 
6 FIXELS 270.12SSAY ->■ STYLE 65536 FOOT 'Geneva-.M CCLoSTo -1 -?3 ^O.-I.-i.-l.-l 
8 PIXELS 196.126 GST FTF STYLE 65536 FOOT 'Chlcacb' 12 PICTURE *@*C Print' ta et-rr ,c . 

6 FIXELS 189,0 TO 257.120 STYLE 3871 COLOR 0.0.-l7-2«600 -1-1 15,> 
C FIXELS 209,8 SAY 'Library eelection' STYLE £5536 FONT ^Geneva" 266 COLOR 00111, 
e PIXELS 227,18 GET ENTIRE S^ 6553 6 FONT 'Chicago'!^ ?^'' e ^^v 0 Al^.8;l;chi^ S 1 «i 1 l< 



* EOF: Master analysis.fmt
READ
IF ANAL=9
CLEAR
CLOSE DATABASES
ERASE TEMPMASTER.DBF
USE "SmartGuy:FoxBASE+/Mac:fox files:clones.dbf"
SET SAFETY ON
SCREEN 1 OFF
RETURN
ENDIF
CLEAR
? INITIATE
? TERMINATE
? CONDEN
? ANAL
? ematch
? hmatch
? omatch
? IMATCH
SET TALK ON

IF ENTIRE=2
USE "Unique libraries.dbf"
REPLACE ALL i WITH ' '
ENDIF
USE "SmartGuy:FoxBASE+/Mac:fox files:clones.dbf"
*COPY TO TEMPNUM FOR NUMBER>=INITIATE .AND. NUMBER<=TERMINATE
*USE TEMPNUM
COPY STRUCTURE TO TEMPLIB
USE TEMPLIB
IF ENTIRE=1
APPEND FROM "SmartGuy:FoxBASE+/Mac:fox files:clones.dbf"
ENDIF
IF ENTIRE=2
USE "Unique libraries.dbf"
COPY TO SELECTED FOR UPPER(i)='Y'
USE SELECTED
STORE RECCOUNT() TO STOPIT
MARK=1
DO WHILE .T.
IF MARK>STOPIT
CLEAR
EXIT
ENDIF
USE SELECTED
GO MARK
STORE library TO THISONE
? 'COPYING '
?? THISONE
USE TEMPLIB
APPEND FROM "SmartGuy:FoxBASE+/Mac:fox files:clones.dbf" FOR library=THISONE
STORE MARK+1 TO MARK
LOOP
ENDDO
ENDIF
COUNT TO STARTOT
COPY STRUCTURE TO TEMPDESIG
USE TEMPDESIG
IF Ematch=0 .AND. Hmatch=0 .AND. Omatch=0 .AND. IMATCH=0 .AND. XMATCH=0
APPEND FROM TEMPLIB
ENDIF
IF Ematch=1
APPEND FROM TEMPLIB FOR D='E'
ENDIF
IF Hmatch=1
APPEND FROM TEMPLIB FOR D='H'
ENDIF
IF Omatch=1
APPEND FROM TEMPLIB FOR D='O'
ENDIF
IF Imatch=1
APPEND FROM TEMPLIB FOR D='I'.OR.D='X'.OR.D='N'
ENDIF
IF Xmatch=1
APPEND FROM TEMPLIB FOR D='X'
ENDIF
COUNT TO ANALTOT
SET TALK OFF
DO CASE
CASE PTF=0
SET DEVICE TO PRINT
SET PRINT ON
EJECT
CASE PTF=1
SET ALTERNATE TO 'Total function sort.txt'
*SET ALTERNATE TO 'H and O function sort.txt'
*SET ALTERNATE TO 'Shear Stress HUVEC 2:abundance sort.txt'
*SET ALTERNATE TO 'Shear Stress HUVEC 2 cof
*SET ALTERNATE TO 'Shear Stress HUVEC 2:S S r '^'
*SET ALTERNATE TO 'Shear Stress HUVEC 1:Clone list.txt'
ENDCASE
IF PRINTON=1

•WO SAV -Database Subset Analysis' EW £5336 KOT COLOR 0, 0, 0,-1,-1,-1 

?
?
?
? DATE()
?? ' '
?? TIME()
? 'Clone numbers '
?? STR(INITIATE,6,0)
?? ' through '
?? STR(TERMINATE,6,0)
? 'Libraries: '
IF ENTIRE=1
? 'All libraries'
ENDIF
IF ENTIRE=2
MARK=1
DO WHILE .T.
IF MARK>STOPIT
EXIT
ENDIF
USE SELECTED
GO MARK
? ' '
?? TRIM(libname)
STORE MARK+1 TO MARK
LOOP
ENDDO
ENDIF
? 'Designations: '
IF Ematch=0 .AND. Hmatch=0 .AND. Omatch=0 .AND. IMATCH=0
?? 'All'
ENDIF
IF Ematch=1
?? 'Exact, '
ENDIF
IF Hmatch=1
?? 'Human, '
ENDIF
IF Omatch=1
?? 'Other sp. '
ENDIF
IF Imatch=1
?? 'INCYTE'
ENDIF
IF Xmatch=1
?? 'EST'
ENDIF
IF CONDEN=1
? 'Condensed format analysis'
ENDIF
IF ANAL=1
? 'Sorted by NUMBER'
ENDIF
IF ANAL=2
? 'Sorted by ENTRY'
ENDIF
IF ANAL=3
? 'Arranged by ABUNDANCE'
ENDIF
IF ANAL=4
? 'Sorted by INTEREST'
ENDIF
IF ANAL=5
? 'Arranged by LOCATION'
ENDIF
IF ANAL=6
? 'Arranged by DISTRIBUTION'
ENDIF
IF ANAL=7
? 'Arranged by FUNCTION'
ENDIF
? 'Total clones represented: '
?? STR(STARTOT,6,0)
? 'Total clones analyzed: '
?? STR(ANALTOT,6,0)
?

J '1 - library d * detonation f * distribution z = location r - function c . car 

USE TEMPDESIG
SCREEN 1 TYPE 0 HEADING "Screen 1" AT 40,2 SIZE 286,452 PIXELS FONT 'Geneva',7 COLOR 0,0,0,
DO CASE

CASE ANAL=1
* sort/number
SET HEADING ON
IF CONDEN=1
SORT TO TEMP1 ON ENTRY,NUMBER
DO "COMPRESSION number.PRG"
ELSE
SORT TO TEMP1 ON NUMBER
USE TEMP1
LIST OFF FIELDS number,L,D,F,Z,R,C,ENTRY,S,DESCRIPTOR
ERASE TEMP1.DBF
ENDIF

CASE ANAL=2
* sort/DESCRIPTOR
SET HEADING ON

•SORT TO TEMPI ON DESCRIPTOR, ENTRY, NUICER/S for D='E' .OR D*'F' OR D-»o« op twv. ™ r^.* 
•SORT TO TEMPI CK ENHIY, DESCRIPTOR, NUKEER/S for cL'£' qr.'d! 'H' ' 54 'm D 5 *£p ^ X 

DO "COMPRESSION entry.PRG"
ELSE
USE TEMP1

cS£ SJaSS? "^'^'^'^k^^ 

ERASE TEMPI. DBF 
ENDIF 
CASE ANAL=3
* sort by abundance
SET HEADING ON
SORT TO TEMP1 ON ENTRY,NUMBER FOR D='E'.OR.D='H'.OR.D='O'.OR.D='X'.OR.D='I'
DO "COMPRESSION abundance.PRG"

CASE ANAL=4
* sort/interest
SET HEADING ON
IF CONDEN=1
SORT TO TEMP1 ON ENTRY,NUMBER FOR I>0
DO "COMPRESSION interest.PRG"
ELSE
SORT ON I/D,ENTRY TO TEMP1 FOR I>1
USE TEMP1
ERASE TEMP1.DBF
ENDIF

CASE ANAL=5
* arrange/location
SET HEADING ON
STORE 4 TO AMPLIFIER

? 'Nuclear:'
SORT ON ENTRY,NUMBER FIELDS RFEND,NUMBER,L,D,F,Z,R,C,ENTRY,S,DESCRIPTOR,LENGTH,INIT,I,COMMENT
IF CONDEN=1
DO "Compression location.prg"
ELSE
DO "Normal subroutine 1"
ENDIF

? 'Cytoplasmic:'
SORT ON ENTRY,NUMBER FIELDS RFEND,NUMBER,L,D,F,Z,R,C,ENTRY,S,DESCRIPTOR,LENGTH,INIT,I,COMMENT
IF CONDEN=1
DO "Compression location.prg"
ELSE
DO "Normal subroutine 1"
ENDIF

? 'Cytoskeleton:'
SORT ON ENTRY,NUMBER FIELDS RFEND,NUMBER,L,D,F,Z,R,C,ENTRY,S,DESCRIPTOR,LENGTH,INIT,I,COMMENT
IF CONDEN=1
DO "Compression location.prg"
ELSE
DO "Normal subroutine 1"
ENDIF

? 'Cell surface:'
SORT ON ENTRY,NUMBER FIELDS RFEND,NUMBER,L,D,F,Z,R,C,ENTRY,S,DESCRIPTOR,LENGTH,INIT,I,COMMENT
IF CONDEN=1
DO "Compression location.prg"
ELSE
DO "Normal subroutine 1"
ENDIF

? 'Intracellular membrane:'
SORT ON ENTRY,NUMBER FIELDS RFEND,NUMBER,L,D,F,Z,R,C,ENTRY,S,DESCRIPTOR,LENGTH,INIT,I,COMMENT
IF CONDEN=1
DO "Compression location.prg"
ELSE
DO "Normal subroutine 1"
ENDIF

? 'Mitochondrial:'
IF CONDEN=1
DO "Compression location.prg"
ELSE
DO "Normal subroutine 1"
ENDIF

? 'Secreted:'
IF CONDEN=1
DO "Compression location.prg"
ELSE
DO "Normal subroutine 1"
ENDIF

? 'Other:'
SORT ON ENTRY,NUMBER FIELDS RFEND,NUMBER,L,D,F,Z,R,C,ENTRY,S,DESCRIPTOR,LENGTH,INIT,I,COMMENT
IF CONDEN=1
DO "Compression location.prg"
ELSE
DO "Normal subroutine 1"
ENDIF

? 'Unknown:'
IF CONDEN=1
DO "Compression location.prg"
ELSE
DO "Normal subroutine 1"
ENDIF

IF CONDEN=1
SET DEVICE TO PRINTER
SET PRINTER ON
EJECT
DO "Output heading.prg"
USE "Analysis location.dbf"
DO "Create bargraph.prg"
SET HEADING OFF
? ' FUNCTIONAL CLASS TOTAL UNIQUE NEW % TOTAL'
LIST OFF FIELDS Z,NAME,CLONES,GENES,NEW,PERCENT,GRAPH
CLOSE DATABASES
ERASE TEMP2.DBF
SET HEADING ON
*USE "SmartGuy:FoxBASE+/Mac:fox files:TEMPMASTER.dbf"
ENDIF

CASE ANAL=6
* arrange/distribution
SET HEADING ON
STORE 3 TO AMPLIFIER

? 'Cell/tissue specific distribution:'
SORT ON ENTRY,NUMBER FIELDS RFEND,NUMBER,L,D,F,Z,R,C,ENTRY,S,DESCRIPTOR,LENGTH,INIT,I,COMMENT
IF CONDEN=1
DO "Compression distrib.prg"
ELSE
DO "Normal subroutine 1"
ENDIF

? 'Non-specific distribution:'
SORT ON ENTRY,NUMBER FIELDS RFEND,NUMBER,L,D,F,Z,R,C,ENTRY,S,DESCRIPTOR,LENGTH,INIT,I,COMMENT
IF CONDEN=1
DO "Compression distrib.prg"
ELSE
DO "Normal subroutine 1"
ENDIF

? 'Unknown distribution:'
IF CONDEN=1
DO "Compression distrib.prg"
ELSE
DO "Normal subroutine 1"
ENDIF

IF CONDEN=1
SET DEVICE TO PRINTER
SET PRINTER ON
EJECT
DO "Output heading.prg"
USE "Analysis distribution.dbf"
DO "Create bargraph.prg"
SET HEADING OFF
? ' FUNCTIONAL CLASS TOTAL UNIQUE % TOTAL'
LIST OFF FIELDS F,NAME,CLONES,GENES,PERCENT,GRAPH
CLOSE DATABASES
ERASE TEMP2.DBF
SET HEADING ON
*USE "SmartGuy:FoxBASE+/Mac:fox files:TEMPMASTER.dbf"
ENDIF

CASE ANAL=7
* arrange/function
SET HEADING ON
STORE 10 TO AMPLIFIER

* ' BINDING PROTEINS'
? 'Surface molecules and receptors:'
SORT ON ENTRY,NUMBER FIELDS RFEND,NUMBER,L,D,F,Z,R,C,ENTRY,S,DESCRIPTOR,LENGTH,INIT,I,COMMENT
IF CONDEN=1
DO "Compression function.prg"
ELSE
DO "Normal subroutine 1"
ENDIF

? 'Calcium-binding proteins:'
SORT ON ENTRY,NUMBER FIELDS RFEND,NUMBER,L,D,F,Z,R,C,ENTRY,S,DESCRIPTOR,LENGTH,INIT,I,COMMENT
IF CONDEN=1
DO "Compression function.prg"
ELSE
DO "Normal subroutine 1"
ENDIF

? 'Ligands and effectors:'
SORT ON ENTRY,NUMBER FIELDS RFEND,NUMBER,L,D,F,Z,R,C,ENTRY,S,DESCRIPTOR,LENGTH,INIT,I,COMMENT
IF CONDEN=1
DO "Compression function.prg"
ELSE
DO "Normal subroutine 1"
ENDIF

? 'Other binding proteins:'
SORT ON ENTRY,NUMBER FIELDS RFEND,NUMBER,L,D,F,Z,R,C,ENTRY,S,DESCRIPTOR,LENGTH,INIT,I,COMMENT
IF CONDEN=1
DO "Compression function.prg"
ELSE
DO "Normal subroutine 1"
ENDIF

*EJECT
* ' ONCOGENES'
? 'General oncogenes:'
SORT ON ENTRY,NUMBER FIELDS RFEND,NUMBER,L,D,F,Z,R,C,ENTRY,S,DESCRIPTOR,LENGTH,INIT,I,COMMENT
IF CONDEN=1
DO "Compression function.prg"
ELSE
DO "Normal subroutine 1"
ENDIF

? 'GTP-binding proteins:'
SORT ON ENTRY,NUMBER FIELDS RFEND,NUMBER,L,D,F,Z,R,C,ENTRY,S,DESCRIPTOR,LENGTH,INIT,I,COMMENT
IF CONDEN=1
DO "Compression function.prg"
ELSE
DO "Normal subroutine 1"
ENDIF

? 'Viral elements:'
SORT ON ENTRY,NUMBER FIELDS RFEND,NUMBER,L,D,F,Z,R,C,ENTRY,S,DESCRIPTOR,LENGTH,INIT,I,COMMENT
IF CONDEN=1
DO "Compression function.prg"
ELSE
DO "Normal subroutine 1"
ENDIF

? 'Kinases and Phosphatases:'
SORT ON ENTRY,NUMBER FIELDS RFEND,NUMBER,L,D,F,Z,R,C,ENTRY,S,DESCRIPTOR,LENGTH,INIT,I,COMMENT
IF CONDEN=1
DO "Compression function.prg"
ELSE
DO "Normal subroutine 1"
ENDIF

? 'Tumor-related antigens:'
SORT ON ENTRY,NUMBER FIELDS RFEND,NUMBER,L,D,F,Z,R,C,ENTRY,S,DESCRIPTOR,LENGTH,INIT,I,COMMENT
IF CONDEN=1
DO "Compression function.prg"
ELSE
DO "Normal subroutine 1"
ENDIF

*EJECT
? ' PROTEIN SYNTHETIC MACHINERY PROTEINS'
? 'Transcription and Nucleic Acid-binding proteins:'
SORT ON ENTRY,NUMBER FIELDS RFEND,NUMBER,L,D,F,Z,R,C,ENTRY,S,DESCRIPTOR,LENGTH,INIT,I,COMMENT
IF CONDEN=1
DO "Compression function.prg"
ELSE
DO "Normal subroutine 1"
ENDIF

? 'Translation:'
IF CONDEN=1
DO "Compression function.prg"
ELSE
DO "Normal subroutine 1"
ENDIF

? 'Ribosomal proteins:'
IF CONDEN=1
DO "Compression function.prg"
ELSE
DO "Normal subroutine 1"
ENDIF

? 'Protein processing:'
IF CONDEN=1
DO "Compression function.prg"
ELSE
DO "Normal subroutine 1"
ENDIF

*EJECT
? ' ENZYMES'
?
? 'Ferroproteins:'
IF CONDEN=1
DO "Compression function.prg"
ELSE
DO "Normal subroutine 1"
ENDIF

? 'Proteases and inhibitors:'
IF CONDEN=1
DO "Compression function.prg"
ELSE
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DO •Nornal subroutine 1* 

? 'Oxidative phosphorylation: ■ 

SORT ON EOTRY, NUMBER FIELDS RFD© NUMftrc t n r> » « « - 

DO 'Compression function. pro* «wn 
ELSE w 
DO 'Normal subroutine 1* 
ENDIF 

? 'Sugar 'jnet&bolifimi 1 

SORT ON D?TRY,NUMEER FIELDS RFEMD MTT^trtp t n n * ~ 

IF CQNDEN-1 ^ ' m ^' L ' D ' F ' Z ' R ' C '^^^DESCRIWOR,l^ ra<TOT/ i <C( ^ 

DO •Conpreseion functionary* 

ELSE 

DO •Normal subroutine I 9 
DOXF 

? 'Amino acid metabolism; ' 

SORT ON OTKY,NiaESR FIELDS RFDTO,NUK=ER r n r 9 t> r * 

DO •Compression- function. prg # 
ELSE 

DO 'Normal subroutine I* 
BOX? 

? 'NUcleic acid metaboliami • 

SORT ON ENTRY, NUMBER FIELDS RFEND NUvfto t 

IF CONDOM TOil ^' L ' fl ' r '*- R ' C ^^^ 

DO •Compression f unction. prg' 

ELSE 

DO *Nermal subroutine i» 
ENDXF 

? 'Lipid metabolism: ' 

SORT CW DTOIY, NUMBER FIELDS RFSND MtMru t ^ ^ „ ~ 

DO 'Conpresexon function .pre" 
ELSE 

DO •Normal subroutine i» 
DTCF 

? ' Other enzymes i ' 

SORT ON ENTRY, NUMBER FIELDS RFSND Miwnro t n * * • - _ 

DO •Compression function .pre B 
ELSE 

DO 'Normal subroutine 2" 

Q3DXF 

♦EJECT 

? 'MISCELLANEOUS CATEGORIES'

? 'Stress response:'
SORT ON ENTRY, NUMBER FIELDS RFEND, NUMBER, L, D, F, Z, R, C, ENTRY, S, DESCRIPTOR, LENGTH, INIT, I, COMMENT ...
IF CONDEN=1
  DO 'Compression function.prg'
ELSE
  DO 'Normal subroutine 1'
ENDIF

? 'Structural:'
SORT ON ENTRY, NUMBER FIELDS RFEND, NUMBER, L, D, F, Z, R, C, ENTRY, S, DESCRIPTOR, LENGTH, INIT, I, COMMENT ...
IF CONDEN=1
  DO 'Compression function.prg'
ELSE
  DO 'Normal subroutine 1'
ENDIF

? 'Other clones:'
SORT ON ENTRY, NUMBER FIELDS RFEND, NUMBER, L, D, F, Z, R, C, ENTRY, S, DESCRIPTOR, LENGTH, INIT, I, COMMENT ...
IF CONDEN=1
  DO 'Compression function.prg'
ELSE
  DO 'Normal subroutine 1'
ENDIF

? 'Clones of unknown function:'
SORT ON ENTRY, NUMBER FIELDS RFEND, NUMBER, L, D, F, Z, R, C, ENTRY, S, DESCRIPTOR, LENGTH, INIT, I, COMMENT ...
IF CONDEN=1
  DO 'Compression function.prg'
ELSE
  DO 'Normal subroutine 1'
ENDIF

IF CONDEN=1
  EJECT
  *SET DEVICE TO PRINTER
  *SET PRINT ON
  DO 'Output heading.prg'
  *
  USE 'Analysis function.dbf'
  DO 'Create bargraph.prg'
  SET HEADING OFF
  SCREEN 1 TYPE 0 HEADING 'Screen 1' AT 40,2 SIZE ... PIXELS FONT 'Geneva',12 COLOR 0,0,0
  ?
  ? '  FUNCTIONAL ...  TOTAL   TOTAL   NEW    DIST'
  ? '  ...             CLONES  GENES   GENES  FUNCTIONAL CLASS'
  *LIST OFF FIELDS P,NAME,CLONES,GENES,NEW,PERCENT,GRAPH,COMPANY
  LIST OFF FIELDS P,NAME,CLONES,GENES,NEW,PERCENT,GRAPH
  CLOSE DATABASES
  ERASE TEMP2.DBF
  SET HEADING ON
  *USE 'SmartGuy:FoxBASE+/Mac:fox files:TEMPMASTER.dbf'
ENDIF

CASE ANAL=...
  DO 'Subgroup summary 3.prg'
ENDCASE

DO 'Test print.prg'
SET PRINT OFF
SET DEVICE TO SCREEN
CLOSE DATABASES
*ERASE TEMPLIB.DBF
*ERASE TEMPNUM.DBF
*ERASE TEMPDESIG.DBF
*ERASE SELECTED.DBF
CLEAR
LOOP
ENDDO
* COMPRESSION SUBROUTINE FOR ANALYSIS PROGRAMS
USE TEMP1
COUNT TO TOT
REPLACE ALL RFEND WITH 1
MARK1 = 1
SW2=0
DO WHILE SW2=0 ROLL
  IF MARK1 >= TOT
    PACK
    COUNT TO UNIQUE
    COUNT TO NEWGENES FOR D='H'.OR.D='O'
    SW2=1
    LOOP
  ENDIF
  GO MARK1
  DUP = 1
  STORE ENTRY TO TESTA
  SW = 0
  DO WHILE SW=0 TEST
    SKIP
    STORE ENTRY TO TESTB
    IF TESTA = TESTB
      DELETE
      DUP = DUP+1
      LOOP
    ENDIF
    GO MARK1
    REPLACE RFEND WITH DUP
    MARK1 = MARK1+DUP
    SW=1
    LOOP
  ENDDO TEST
  LOOP
ENDDO ROLL
*GO TOP
STORE Z TO LOC
USE 'Analysis location.dbf'
LOCATE FOR Z=LOC
REPLACE CLONES WITH TOT
REPLACE GENES WITH UNIQUE
REPLACE NEW WITH NEWGENES
USE TEMP1
SORT ON RFEND/D TO TEMP2
USE TEMP2
?? STR(UNIQUE,5,0)
?? ' genes, for a total of '
?? STR(TOT,5,0)
?? ' clones'
? ' V Coincidence'
list off fields number,RFEND,L,D,F,Z,R,C,...
*SET PRINT OFF
CLOSE DATABASES
ERASE TEMP1.DBF
ERASE TEMP2.DBF
USE TEMPDESIG
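
The compression subroutine above walks a table sorted by ENTRY, deletes repeated entries, and stores the number of coincident clones in RFEND. The following is a minimal Python sketch of that idea, not the listing itself. The function name `compress` and the dictionary keys are hypothetical, and the input is assumed to be already sorted by entry, as the SORT ON ENTRY step in the calling program would leave it.

```python
from itertools import groupby


def compress(records):
    """Collapse clones that share a database entry into one row per gene.

    records: list of dicts with 'entry' and 'd' keys, assumed sorted by
    'entry'. Returns (rows, tot, unique, newgenes).
    """
    tot = len(records)  # COUNT TO TOT
    rows = []
    for _, group in groupby(records, key=lambda r: r["entry"]):
        group = list(group)
        row = dict(group[0])       # keep the first clone of each run
        row["rfend"] = len(group)  # number of coincident clones
        rows.append(row)
    unique = len(rows)             # COUNT TO UNIQUE after PACK
    # Genes matched to a human ('H') or other-species ('O') homolog.
    newgenes = sum(1 for r in rows if r["d"] in ("H", "O"))
    # Most abundant genes first, like SORT ON RFEND/D.
    rows.sort(key=lambda r: r["rfend"], reverse=True)
    return rows, tot, unique, newgenes
```

Called on six clones covering three entries, the sketch returns three rows whose `rfend` values sum to six.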
* COMPRESSION SUBROUTINE FOR ANALYSIS PROGRAMS
USE TEMP1
COUNT TO TOT
REPLACE ALL RFEND WITH 1
MARK1 = 1
SW2=0
DO WHILE SW2=0 ROLL
  IF MARK1 >= TOT
    PACK
    COUNT TO UNIQUE
    SW2=1
    LOOP
  ENDIF
  GO MARK1
  DUP = 1
  STORE ENTRY TO TESTA
  SW = 0
  DO WHILE SW=0 TEST
    SKIP
    STORE ENTRY TO TESTB
    IF TESTA = TESTB
      DELETE
      DUP = DUP+1
      LOOP
    ENDIF
    GO MARK1
    REPLACE RFEND WITH DUP
    MARK1 = MARK1+DUP
    SW=1
    LOOP
  ENDDO TEST
  LOOP
ENDDO ROLL
*BROWSE
*SET PRINTER ON
SORT ON ... TO TEMP2
USE TEMP2
?? STR(UNIQUE,4,0)
?? ' genes, for a total of'
?? STR(TOT,4,0)
?? ' clones'
?
? ' V Coincidence'
COUNT TO P4 FOR I=4
IF P4>0
  ? STR(P4,3,0)
  ?? ' genes with priority = 4 (Secondary analysis:)'
  list off fields number,RFEND,L,D,F,Z,R,C,ENTRY,S,DESCRIPTOR,LENGTH,INIT for I=4
ENDIF
COUNT TO P3 FOR I=3
IF P3>0
  ? STR(P3,3,0)
  ?? ' genes with priority = 3 (Full insert sequence:)'
  list off fields number,RFEND,L,D,F,Z,R,C,ENTRY,S,DESCRIPTOR,LENGTH,INIT for I=3
ENDIF
COUNT TO P2 FOR I=2
IF P2>0
  ? STR(P2,3,0)
  ?? ' genes with priority = 2 (Primary analysis complete:)'
  list off fields number,RFEND,L,D,F,Z,R,C,ENTRY,S,DESCRIPTOR,LENGTH,INIT for I=2
ENDIF
COUNT TO P1 FOR I=1
IF P1>0
  ? STR(P1,3,0)
  list off fields ...
ENDIF
*set print off
close databases
erase temp1.dbf
erase temp2.dbf
USE 'SmartGuy:FoxBASE+/Mac:fox files:clones.dbf'






WO 95/20681 



PCT/US95/01160 



COUNT TO TOT
REPLACE ALL RFEND WITH 1
MARK1 = 1
SW2=0
DO WHILE SW2=0 ROLL
  IF MARK1 >= TOT
    PACK
    COUNT TO UNIQUE
    SW2=1
    LOOP
  ENDIF
  GO MARK1
  DUP = 1
  STORE ENTRY TO TESTA
  SW = 0
  DO WHILE SW=0 TEST
    SKIP
    STORE ENTRY TO TESTB
    IF TESTA = TESTB
      DELETE
      DUP = DUP+1
      LOOP
    ENDIF
    GO MARK1
    REPLACE RFEND WITH DUP
    MARK1 = MARK1+DUP
    SW=1
    LOOP
  ENDDO TEST
  LOOP
ENDDO ROLL
*BROWSE
*SET PRINTER ON
SORT ON NUMBER TO TEMP2
USE TEMP2
?? STR(UNIQUE,4,0)
?? ' genes, for a total of '
?? STR(TOT,5,0)
?? ' clones'
list off fields number,RFEND,L,D,F,Z,R,C,ENTRY,S,DESCRIPTOR,...
*SET PRINT OFF
CLOSE DATABASES
ERASE TEMP1.DBF
ERASE TEMP2.DBF
USE 'SmartGuy:FoxBASE+/Mac:fox files:clones.dbf'
* COMPRESSION SUBROUTINE FOR ANALYSIS PROGRAMS
USE TEMP1
COUNT TO TOT
REPLACE ALL RFEND WITH 1
MARK1 = 1
SW2=0
DO WHILE SW2=0 ROLL
  IF MARK1 >= TOT
    PACK
    COUNT TO UNIQUE
    COUNT TO NEWGENES FOR D='H'.OR.D='O'
    SW2=1
    LOOP
  ENDIF
  GO MARK1
  DUP = 1
  STORE ENTRY TO TESTA
  SW = 0
  DO WHILE SW=0 TEST
    SKIP
    STORE ENTRY TO TESTB
    IF TESTA = TESTB
      DELETE
      DUP = DUP+1
      LOOP
    ENDIF
    GO MARK1
    REPLACE RFEND WITH DUP
    MARK1 = MARK1+DUP
    SW=1
    LOOP
  ENDDO TEST
  LOOP
ENDDO ROLL
GO TOP
STORE R TO FUNC
USE 'Analysis function.dbf'
LOCATE FOR P=FUNC
*REPLACE CLONES WITH TOT
REPLACE GENES WITH UNIQUE
REPLACE NEW WITH NEWGENES
USE TEMP1
SORT ON RFEND/D TO TEMP2
USE TEMP2
SET HEADING ON
?? STR(UNIQUE,5,0)
?? ' genes, for a total of '
?? STR(TOT,5,0)
?? ' clones'
? ' V Coincidence'
list off fields number,RFEND,L,D,F,Z,R,C,...
*SCREEN 1 TYPE 0 HEADING 'Screen 1' AT 40,2 SIZE ...,492 PIXELS FONT 'Geneva',12 COLOR 0,0,0
*list off fields RFEND,S,DESCRIPTOR
*SET PRINT OFF
CLOSE DATABASES
ERASE TEMP1.DBF
ERASE TEMP2.DBF
USE TEMPDESIG
* COMPRESSION SUBROUTINE FOR ANALYSIS PROGRAMS
USE TEMP1
COUNT TO TOT
REPLACE ALL RFEND WITH 1
MARK1 = 1
SW2=0
DO WHILE SW2=0 ROLL
  IF MARK1 >= TOT
    PACK
    COUNT TO UNIQUE
    SW2=1
    LOOP
  ENDIF
  GO MARK1
  DUP = 1
  STORE ENTRY TO TESTA
  SW = 0
  DO WHILE SW=0 TEST
    SKIP
    STORE ENTRY TO TESTB
    IF TESTA = TESTB
      DELETE
      DUP = DUP+1
      LOOP
    ENDIF
    GO MARK1
    REPLACE RFEND WITH DUP
    MARK1 = MARK1+DUP
    SW=1
    LOOP
  ENDDO TEST
  LOOP
ENDDO ROLL
GO TOP
STORE F TO DIST
USE 'Analysis distribution.dbf'
LOCATE FOR P=DIST
REPLACE CLONES WITH TOT
REPLACE GENES WITH UNIQUE
USE TEMP1
sort on rfend/d to TEMP2
USE TEMP2
?? STR(UNIQUE,5,0)
?? ' genes, for a total of '
?? STR(TOT,5,0)
?? ' clones'
? ' V Coincidence'
list off fields number,RFEND,L,D,F,Z,R,C,...
*SET PRINT OFF
CLOSE DATABASES
ERASE TEMP1.DBF
ERASE TEMP2.DBF
USE TEMPDESIG
* COMPRESSION SUBROUTINE FOR ANALYSIS PROGRAMS
USE TEMP1
COUNT TO TOT
REPLACE ALL RFEND WITH 1
MARK1 = 1
SW2=0
DO WHILE SW2=0 ROLL
  IF MARK1 >= TOT
    PACK
    COUNT TO UNIQUE
    SW2=1
    LOOP
  ENDIF
  GO MARK1
  DUP = 1
  STORE ENTRY TO TESTA
  SW = 0
  DO WHILE SW=0 TEST
    SKIP
    STORE ENTRY TO TESTB
    IF TESTA = TESTB
      DELETE
      DUP = DUP+1
      LOOP
    ENDIF
    GO MARK1
    REPLACE RFEND WITH DUP
    MARK1 = MARK1+DUP
    SW=1
    LOOP
  ENDDO TEST
  LOOP
ENDDO ROLL
GO TOP
USE TEMP1
?? STR(UNIQUE,5,0)
?? ' genes, for a total of '
?? STR(TOT,5,0)
?? ' clones'
list off fields number,RFEND,...
*SET PRINT OFF
CLOSE DATABASES
ERASE TEMP1.DBF
USE TEMPDESIG
* COMPRESSION SUBROUTINE FOR ANALYSIS PROGRAMS
COPY TO TEMP1 FOR ...
USE TEMP1
COUNT TO IDGENE FOR D='E'.OR.D='O'.OR.D='H'.OR. ...
COUNT TO TOT
REPLACE ALL RFEND WITH 1
MARK1 = 1
SW2=0
DO WHILE SW2=0 ROLL
  IF MARK1 >= TOT
    PACK
    COUNT TO UNIQUE
    SW2=1
    LOOP
  ENDIF
  GO MARK1
  DUP = 1
  STORE ENTRY TO TESTA
  SW = 0
  DO WHILE SW=0 TEST
    SKIP
    STORE ENTRY TO TESTB
    IF TESTA = TESTB
      DELETE
      DUP = DUP+1
      LOOP
    ENDIF
    GO MARK1
    REPLACE RFEND WITH DUP
    MARK1 = MARK1+DUP
    SW=1
    LOOP
  ENDDO TEST
  LOOP
ENDDO ROLL
*BROWSE
*SET PRINTER ON
SORT ON RFEND/D, NUMBER TO TEMP2
USE TEMP2
REPLACE ALL START WITH RFEND/IDGENE*10000
?? STR(UNIQUE,5,0)
?? ' genes, for a total of '
?? STR(TOT,5,0)
?? ' clones'
? ' Coincidence V v Clones/10000'
set heading off
SCREEN 1 TYPE 0 HEADING 'Screen 1' AT 40,2 SIZE ...
CLOSE DATABASES
ERASE TEMP1.DBF
ERASE TEMP2.DBF
USE 'SmartGuy:FoxBASE+/Mac:fox files:clones.dbf'
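
The abundance listing above divides each gene's coincidence count (RFEND) by the number of identified clones (IDGENE) and scales the result to clones per 10,000. A minimal Python sketch of that normalization follows. The function name, the tuple layout, and the default set of "identified" designation codes are assumptions; the listing's full designation condition is only partly legible, and E, O and H are the codes that can be read.

```python
def abundance_per_10000(rows, clone_designations, identified=frozenset("EOH")):
    """Scale per-gene clone counts to clones per 10,000 identified clones.

    rows: (number, entry, count) tuples, one per gene after compression.
    clone_designations: designation code of every clone before compression.
    identified: codes counted as identified (an assumption, see above).
    """
    idgene = sum(1 for d in clone_designations if d in identified)
    if idgene == 0:
        raise ValueError("no identified clones to normalize against")
    # Most abundant first, ties broken by clone number (SORT ON RFEND/D, NUMBER).
    ordered = sorted(rows, key=lambda r: (-r[2], r[0]))
    return [(entry, count, count / idgene * 10000) for _, entry, count in ordered]
```

For example, a gene seen in 6 clones of a library with 8 identified clones is reported at 7,500 clones per 10,000.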
* COMPRESSION SUBROUTINE FOR ANALYSIS PROGRAMS
USE TEMP1
COUNT TO IDGENE FOR D='E'.OR.D='O'.OR.D='H'.OR. ...
COUNT TO TOT
REPLACE ALL RFEND WITH 1
MARK1 = 1
SW2=0
DO WHILE SW2=0 ROLL
  IF MARK1 >= TOT
    PACK
    COUNT TO UNIQUE
    SW2=1
    LOOP
  ENDIF
  GO MARK1
  DUP = 1
  STORE ENTRY TO TESTA
  SW = 0
  DO WHILE SW=0 TEST
    SKIP
    STORE ENTRY TO TESTB
    IF TESTA = TESTB
      DELETE
      DUP = DUP+1
      LOOP
    ENDIF
    GO MARK1
    REPLACE RFEND WITH DUP
    MARK1 = MARK1+DUP
    SW=1
    LOOP
  ENDDO TEST
  LOOP
ENDDO ROLL
*BROWSE
*SET PRINTER ON
SORT ON RFEND/D, NUMBER TO TEMP2
USE TEMP2
REPLACE ALL START WITH RFEND/IDGENE*10000
?? STR(UNIQUE,5,0)
?? ' genes, for a total of '
?? STR(TOT,5,0)
?? ' clones'
? ' Coincidence V v Clones/10000'
set heading off
SCREEN 1 TYPE 0 HEADING 'Screen 1' AT 40,2 SIZE ...
CLOSE DATABASES
ERASE TEMP1.DBF
ERASE TEMP2.DBF
USE 'SmartGuy:FoxBASE+/Mac:fox files:clones.dbf'
USE TEMP1
COUNT TO TOT
?? ' Total of'
?? STR(TOT,4,0)
?? ' clones'
?
*list off fields number,L,D,F,Z,R,C,ENTRY,...
ERASE TEMP1.DBF
USE TEMPDESIG
*Lifeseq menu; version 8-7-94
SET TALK OFF
set device to screen
CLEAR
USE 'SmartGuy:FoxBASE+/Mac:fox files:clones.dbf'
STORE LUPDATE() TO Update
GO BOTTOM
STORE RECNO() TO cloneno
STORE 6 TO Chooser
DO WHILE .T.
  * Program.: Lifeseq menu.fmt
  * Date....: 1/11/95
  * Version.: FoxBASE+/Mac, revision 1.10
  * Notes...: Format file Lifeseq menu
  *
  SCREEN 1 TYPE 0 HEADING 'Screen 2' AT ...
  @ PIXELS 18,126 TO 77,365 STYLE 26479 COLOR ...
  @ PIXELS 110,29 TO 168,217 STYLE 3871 COLOR ...
  @ PIXELS 45,161 SAY 'LIFESEQ' STYLE ... FONT 'Geneva',268 COLOR ...
  @ PIXELS 36,269 SAY ...
  *
  * EOF: Lifeseq menu.fmt
  READ
  DO CASE
    CASE Chooser=1
      DO '...:Output programs:Northern (single).prg'
    ...
      USE 'Libraries.dbf'
      BROWSE
    CASE Chooser=5
      DO 'SmartGuy:FoxBASE+/Mac:fox files:Output programs:See ... clone.prg'
    ...
      DO '...:Output programs:...menu.prg'
      CLEAR
      SCREEN 1 OFF
      RETURN
  ENDCASE
  LOOP
ENDDO
@1,30 SAY 'Database Subset Analysis' STYLE 65536 FONT 'Geneva',274 COLOR 0,0,0,-1,-1,-1
?
?
? date()
?? ' '
?? TIME()
? 'Clone numbers '
?? STR(INITIATE,6,0)
?? ' through '
?? STR(TERMINATE,6,0)
? 'Libraries: '
IF ENTIRE=1
  ? 'All libraries'
ENDIF
IF ENTIRE=2
  MARK=1
  DO WHILE .T.
    IF MARK>STOPIT
      EXIT
    ENDIF
    USE SELECTED
    GO MARK
    ? ' '
    ?? TRIM(libname)
    STORE MARK+1 TO MARK
    LOOP
  ENDDO
ENDIF
? 'Designations: '
IF Ematch=0 .AND. Hmatch=0 .AND. Omatch=0
  ?? 'All'
ENDIF
IF Ematch=1
  ?? 'Exact, '
ENDIF
IF Hmatch=1
  ?? 'Human, '
ENDIF
IF Omatch=1
  ?? 'Other sp. '
ENDIF
IF CONDEN=1
  ? 'Condensed format analysis'
ENDIF
IF ANAL=1
  ? 'Sorted by number'
ENDIF
IF ANAL=2
  ? 'Sorted by ENTRY'
ENDIF
IF ANAL=3
  ? 'Arranged by ABUNDANCE'
ENDIF
IF ANAL=4
  ? 'Sorted by INTEREST'
ENDIF
IF ANAL=5
  ? 'Arranged by LOCATION'
ENDIF
IF ANAL=6
  ? 'Arranged by distribution'
ENDIF
IF ANAL=7
  ? 'Arranged by FUNCTION'
ENDIF
? 'Total clones represented: '
?? STR(STARTOT,6,0)
? 'Total clones analyzed: '
?? STR(ANALTOT,6,0)
?
?
USE TEMP1
COUNT TO TOT
?? ' Total of'
?? STR(TOT,4,0)
?? ' clones'
list off fields number,L,D,F,Z,R,C,ENTRY,DESCRIPTOR,LENGTH,RFEND,INIT,I
list off fields number,L,D,F,Z,R,C,...
CLOSE DATABASES
ERASE TEMP1.DBF
USE TEMPDESIG
USE TEMP1
COUNT TO TOT
?? ' Total of'
?? STR(TOT,4,0)
?? ' clones'
?
*list off fields number,L,D,F,Z,R,C,ENTRY,...
USE TEMPDESIG
*Northern (single), version 11-25-94
close databases
SET TALK OFF
SET PRINT OFF
SET EXACT OFF
CLEAR
STORE '        ' TO Eobject
STORE '                    ' TO Dobject
STORE 0 TO Numb
STORE 0 TO Zog
STORE 1 TO Bail
DO WHILE .T.
  * Program.: Northern (single).fmt
  * Date....: 8/ 8/94
  * Version.: FoxBASE+/Mac, revision 1.10
  * Notes...: Format file Northern (single)
  *
  SCREEN 1 TYPE 0 HEADING 'Screen 1' AT 40,2 SIZE ... PIXELS FONT 'Geneva',12 COLOR 0,0,0
  @ PIXELS 15,81 TO 45,397 STYLE ... COLOR ...
  @ PIXELS 89,79 TO 192,422 STYLE 28447 COLOR ...
  @ PIXELS 145,89 SAY ...
  @ PIXELS 145,173 GET ...
  @ PIXELS 35,89 SAY 'Single Northern search ...' ...
  @ PIXELS 220,162 GET Bail STYLE ... PICTURE '@*R Continue;Bail out' SIZE ...
  @ PIXELS 175,96 SAY 'Clone ...' ...
  @ PIXELS 175,173 GET ...
  @ PIXELS 60,152 SAY 'Enter ONE of the following:' ...
  *
  * EOF: Northern (single).fmt
  READ
  IF Bail=2
    CLEAR
    screen 1 off
    RETURN
  ENDIF
  USE 'SmartGuy:FoxBASE+/Mac:Fox files:Lookup.dbf'
  SET TALK ON


  IF Eobject<>'        '
    STORE UPPER(Eobject) to Eobject
    SET SAFETY OFF
    SORT ON Entry TO 'Lookup entry.dbf'
    SET SAFETY ON
    USE 'Lookup entry.dbf'
    LOCATE FOR Look=Eobject
    IF .NOT.FOUND()
      CLEAR
      LOOP
    ENDIF
    BROWSE
    STORE Entry TO Searchval
    CLOSE DATABASES
    ERASE 'Lookup entry.dbf'
  ENDIF

  IF Dobject<>'                    '
    SET EXACT OFF
    SET SAFETY OFF
    SORT ON descriptor TO 'Lookup descriptor.dbf'
    SET SAFETY ON
    USE 'Lookup descriptor.dbf'
    LOCATE FOR UPPER(TRIM(descriptor))=UPPER(TRIM(Dobject))
    IF .NOT.FOUND()
      CLEAR
      LOOP
    ENDIF
    BROWSE
    STORE Entry TO Searchval
    CLOSE DATABASES
    ERASE 'Lookup descriptor.dbf'
    SET EXACT ON
  ENDIF

  IF Numb<>0
    USE 'SmartGuy:FoxBASE+/Mac:Fox files:clones.dbf'
    GO Numb
    BROWSE
    STORE Entry TO Searchval
  ENDIF

  CLEAR
  ? 'Northern analysis for entry '
  ?? Searchval
  ?
  ? 'Enter Y to proceed'
  WAIT TO OK
  CLEAR
  IF UPPER(OK) <> 'Y'
    screen 1 off
    RETURN
  ENDIF

  * COMPRESSION SUBROUTINE FOR Library.dbf
  ? 'Compressing the Libraries file now...'
  USE 'SmartGuy:FoxBASE+/Mac:Fox files:libraries.dbf'
  SET SAFETY OFF
  SORT ON library TO 'Compressed libraries.dbf'
  * FOR entered>0
  SET SAFETY ON
  USE 'Compressed libraries.dbf'
  DELETE FOR entered=0
  PACK
  COUNT TO TOT
  MARK1 = 1
  SW2=0
  DO WHILE SW2=0 ROLL
    IF MARK1 >= TOT
      PACK
      SW2=1
      LOOP
    ENDIF
    GO MARK1
    STORE library TO TESTA
    SKIP
    STORE library TO TESTB
    IF TESTA = TESTB
      DELETE
    ENDIF
    MARK1 = MARK1+1
    LOOP
  ENDDO ROLL

  * Northern analysis
  CLEAR
  ? 'Doing the northern now...'
  SET TALK ON
  USE 'SmartGuy:FoxBASE+/Mac:Fox files:clones.dbf'
  SET SAFETY OFF
  COPY TO 'Hits.dbf' FOR entry=searchval
  SET SAFETY ON
  CLOSE DATABASES
  SELECT 1
  USE 'Compressed libraries.dbf'
  STORE RECCOUNT() TO Entries
  SELECT 2
  USE 'Hits.dbf'
  Mark=1
  DO WHILE .T.
    SELECT 1
    IF Mark>Entries
      EXIT
    ENDIF
    GO MARK
    STORE library TO Jigger
    SELECT 2
    COUNT TO Zog FOR library=Jigger
    SELECT 1
    REPLACE hits with Zog
    Mark=Mark+1
    LOOP
  ENDDO

  SELECT 1
  BROWSE FIELDS LIBRARY,LIBNAME,ENTERED,HITS AT 0,0
  CLEAR
  ? 'Enter Y to print:'
  WAIT TO PRINSET
  IF UPPER(PRINSET)='Y'
    SET PRINT ON
    CLEAR
    EJECT
    SCREEN 1 TYPE 0 HEADING 'Screen 1' AT 40,2 SIZE ... PIXELS FONT 'Geneva',... COLOR 0,0,0
    ? 'DATABASE ENTRIES MATCHING ENTRY '
    ?? Searchval
    ? DATE()
    ?
    ?
    SELECT 2
    SET PRINT OFF
  ENDIF
  CLOSE DATABASES
  SET TALK OFF
  CLEAR
  DO 'Test print.prg'
  RETURN
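
The Northern (single) listing above copies every clone matching one database entry into a hits table, then counts those hits library by library. A minimal Python sketch of that tally follows. The function name and dictionary keys are hypothetical; libraries with no entered clones are skipped and duplicate library records collapsed, mirroring the DELETE FOR entered=0 and duplicate-removal steps.

```python
from collections import Counter


def electronic_northern(clones, libraries, searchval):
    """Count, per library, the clones whose entry equals searchval.

    clones: dicts with 'library' and 'entry' keys.
    libraries: dicts with 'library', 'libname' and 'entered' keys.
    Returns (library, libname, entered, hits) tuples sorted by library code.
    """
    hits = Counter(c["library"] for c in clones if c["entry"] == searchval)
    table = []
    seen = set()
    for lib in sorted(libraries, key=lambda rec: rec["library"]):
        code = lib["library"]
        if lib["entered"] == 0 or code in seen:
            continue  # skip empty libraries and duplicate records
        seen.add(code)
        table.append((code, lib["libname"], lib["entered"], hits[code]))
    return table
```

Libraries in which the entry never appears are still listed, with a hit count of zero, so the output reads as an expression profile across tissues.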
TABLE 6

library     libname
ADENINB01   Inflamed adenoid
ADRENOR01   Adrenal gland (r)
ADRENOT01   Adrenal gland (T)
AMLBNOT01   AML blast cells (T)
BMARNOT01   Bone marrow
BMARNOT02   Bone marrow (T)
CARDNOT01   Cardiac muscle (T)
CHAONOT01   Chin. hamster ovary
CORNNOT01   Corneal stroma
FIBRAGT01   Fibroblast, AT 5
FIBRAGT02   Fibroblast, AT 30
FIBRANT01   Fibroblast, AT
FIBRNGT01   Fibroblast, uv 5
FIBRNGT02   Fibroblast, uv 30
FIBRNOT01   Fibroblast
FIBRNOT02   Fibroblast, normal
HMC1NOT01   Mast cell line HMC-1
HUVELPB01   HUVEC IFN, TNF, LPS
HUVENOB01   HUVEC control
HUVESTB01   HUVEC shear stress
HYPONOB01   Hypothalamus
KIDNNOT01   Kidney (T)
LIVRNOT01   Liver (T)
LUNGNOT01   Lung (T)
MUSCNOT01   Skeletal muscle (T)
OVIDNOB01   Oviduct
PANCNOT01   Pancreas, normal
PITUNOR01   Pituitary (r)
PITUNOT01   Pituitary (T)
PLACNOB01   Placenta
SINTNOT02   Small intestine (T)
SPLNFET01   Spleen+liver, fetal
SPLNNOT02   Spleen (T)
STOMNOT01   Stomach
SYNORAB01   Rheum. synovium
TBLYNOT01   T + B lymphoblast
TESTNOT01   Testis (T)
THP1NOB01   THP-1 control
THP1PEB01   THP-1 phorbol
THP1PLB01   THP-1 phorbol LPS
U937NOT01   U937, monocytic leuk

number  library     d  s  f  z  r  entry    descriptor
3304    U937NOT01   E  H  C  C  T  HUMEF1B  Elongation factor 1-beta
3240    HMC1NOT01   E  H  C  C  T  HUMEF1B  Elongation factor 1-beta
3269    HMC1NOT01   E  H  C  C  T  HUMEF1B  Elongation factor 1-beta
4*93    HMC1NOT01   E  H  C  C  T  HUMEF1B  Elongation factor 1-beta
8569    HMC1NOT01   E  H  C  C  T  HUMEF1B  Elongation factor 1-beta
9139    HMC1NOT01   E  H  C  C  T  HUMEF1B  Elongation factor 1-beta



rfattrieiari 


rf end 


U- 0 


773 


0 370 


773 


0 371 


773 


0 470 


773 


0 327 


773 


0 375 


773 
WHAT IS CLAIMED IS:

1. A method of analyzing a specimen containing gene 
transcripts, said method comprising the steps of: 

(a) producing a library of biological sequences; 
(b) generating a set of transcript sequences, where
each of the transcript sequences in said set is indicative 
of a different one of the biological sequences of the 
library; 

(c) processing the transcript sequences in a 
programmed computer in which a database of reference 
transcript sequences indicative of reference biological 
sequences is stored, to generate an identified sequence 
value for each of the transcript sequences, where each said 
identified sequence value is indicative of a sequence 
annotation and a degree of match between one of the 
transcript sequences and at least one of the reference 
transcript sequences; and 

(d) processing each said identified sequence value to 
generate final data values indicative of a number of times 
each identified sequence value is present in the library. 

2. The method of claim 1, wherein step (a) includes
the steps of: 

obtaining a mixture of mRNA; 
making cDNA copies of the mRNA; 
isolating a representative population of clones 
transfected with the cDNA and producing therefrom the 
library of biological sequences. 

3. The method of claim 1, wherein the biological
sequences are cDNA sequences. 

4. The method of claim 1, wherein the biological
sequences are RNA sequences.

5. The method of claim 1, wherein the biological
sequences are protein sequences. 
6. The method of claim 1, wherein a first value of
said degree of match is indicative of an exact match, and a 
second value of said degree of match is indicative of a 
non-exact match. 



7. A method of comparing two specimens containing
gene transcripts, said method comprising:

(a) analyzing a first specimen according to the 
method of claim 1; 

(b) producing a second library of biological
sequences;

(c) generating a second set of transcript sequences, 
where each of the transcript sequences in said second set 
is indicative of a different one of the biological 
sequences of the second library; 

(d) processing the second set of transcript sequences
in said programmed computer to generate a second set of
identified sequence values known as further identified
sequence values, where each of the further identified
sequence values is indicative of a sequence annotation and
a degree of match between one of the biological sequences
of the second library and at least one of the reference
sequences;

(e) processing each said further identified sequence
value to generate further final data values indicative of a
number of times each further identified sequence value is
present in the second library; and

(f) processing the final data values from the first
specimen and the further identified sequence values from
the second specimen to generate ratios of transcript
sequences, each of said ratio values indicative of
differences in numbers of gene transcripts between the two
specimens.
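
Step (f) of claim 7 compares per-entry counts from two libraries as ratios. A minimal Python sketch of such a comparison follows. The function name is hypothetical, and the handling of entries absent from the second specimen (reported as infinity) is an assumption, since the claim does not say how a zero denominator is treated. Raw counts are compared directly, which presumes libraries of comparable size.

```python
import math


def transcript_ratios(first_counts, second_counts):
    """Ratio of first-specimen to second-specimen clone counts per entry.

    Both arguments map a sequence annotation (entry) to the number of
    times it was found in that specimen's library.
    """
    ratios = {}
    for entry in sorted(set(first_counts) | set(second_counts)):
        a = first_counts.get(entry, 0)
        b = second_counts.get(entry, 0)
        if a == 0 and b == 0:
            continue  # nothing observed in either specimen
        # Assumption: an entry missing from the second specimen maps to inf.
        ratios[entry] = a / b if b else math.inf
    return ratios
```

A ratio well above 1 marks a transcript that is more abundant in the first specimen, and a ratio near 0 marks one that is more abundant in the second.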

8. A method of quantifying relative abundance of mRNA
in a biological specimen, said method comprising the steps
of:

(a) isolating a population of mRNA transcripts from
the biological specimen;

(b) identifying genes from which the mRNA was
transcribed by a sequence-specific method;

(c) determining numbers of mRNA transcripts
corresponding to each of the genes; and

(d) using the mRNA transcript numbers to determine
the relative abundance of mRNA transcripts within the
population of mRNA transcripts.

9. A diagnostic method which comprises producing a
gene transcript image, said method comprising the steps of:

(a) isolating a population of mRNA transcripts from a
biological specimen;

(b) identifying genes from which the mRNA was
transcribed by a sequence-specific method;

(c) determining numbers of mRNA transcripts
corresponding to each of the genes; and

(d) using the mRNA transcript numbers to determine
the relative abundance of mRNA transcripts within the
population of mRNA transcripts, where data determining the
relative abundance values of mRNA transcripts is the gene
transcript image of the biological specimen.

10. The method of claim 9, further comprising: 

(e) providing a set of standard normal and diseased 
gene transcript images; and 

(f) comparing the gene transcript image of the
biological specimen with the gene transcript images of step
(e) to identify at least one of the standard gene
transcript images which most closely approximate the gene
transcript image of the biological specimen.
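
Step (f) of claim 10 calls for finding the standard gene transcript image that most closely approximates the specimen's image, without naming a measure of closeness. The following Python sketch assumes one possible measure: Euclidean distance between relative-abundance vectors. The function names are hypothetical and every image is assumed to contain at least one clone.

```python
import math


def relative_abundance(counts):
    """Convert per-gene clone counts to fractions of the whole library."""
    total = sum(counts.values())
    return {gene: n / total for gene, n in counts.items()}


def closest_image(test_counts, reference_images):
    """Return the name of the reference image nearest to the test image.

    reference_images maps a label (for example 'normal') to per-gene
    counts. Closeness is Euclidean distance, an assumption of this sketch.
    """
    test = relative_abundance(test_counts)

    def distance(ref_counts):
        ref = relative_abundance(ref_counts)
        genes = set(test) | set(ref)
        return math.sqrt(
            sum((test.get(g, 0.0) - ref.get(g, 0.0)) ** 2 for g in genes)
        )

    return min(reference_images, key=lambda name: distance(reference_images[name]))
```

Working with fractions rather than raw counts keeps libraries of different sizes comparable. Other distance or correlation measures could be substituted in `distance` without changing the rest of the sketch.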

11. The method of claim 9, wherein the biological
specimen is biopsy tissue, sputum, blood or urine.

12. A method of producing a gene transcript image,
said method comprising the steps of:

(a) obtaining a mixture of mRNA;

(b) making cDNA copies of the mRNA;

(c) inserting the cDNA into a suitable vector and
using said vector to transfect suitable host strain cells
which are plated out and permitted to grow into clones,
each clone representing a unique mRNA;

(d) isolating a representative population of
recombinant clones;

(e) identifying amplified cDNAs from each clone in
the population by a sequence-specific method which
identifies gene from which the unique mRNA was transcribed;

(f) determining a number of times each gene is
represented within the population of clones as an
indication of relative abundance; and

(g) listing the genes and their relative abundance in
order of abundance, thereby producing the gene transcript
image.

13. The method of claim 12, also including the step 
of diagnosing disease by: 

repeating steps (a) through (g) on biological 
specimens from random sample of normal and diseased humans 
encompassing a variety of diseases, to produce reference 
sets of normal and diseased gene transcript images; 

obtaining a test specimen from a human, and producing 
a test gene transcript image by performing steps (a) 
through (g) on said test specimen; 

comparing the test gene transcript image with the 
reference sets of gene transcript images; and 

identifying at least one of the reference gene 
transcript images which most closely approximates the test 
gene transcript image. 

14. A computer system for analyzing a library of
biological sequences, said system including:

means for receiving a set of transcript sequences,
where each of the transcript sequences is indicative of a
different one of the biological sequences of the library;
and

means for processing the transcript sequences in the
computer system in which a database of reference transcript
sequences indicative of reference biological sequences is
stored, wherein the computer is programmed with software
for generating an identified sequence value for each of the
transcript sequences, where each said identified sequence
value is indicative of a sequence annotation and a degree
of match between a different one of the biological
sequences of the library and at least one of the reference
transcript sequences, and for processing each said
identified sequence value to generate final data values
indicative of a number of times each identified sequence
value is present in the library.

15. The system of claim 14, also including:
library generation means for producing the library of
biological sequences and generating said set of transcript
sequences from said library.

16. The system of claim 15, wherein the library
generation means includes:

means for obtaining a mixture of mRNA;
means for making cDNA copies of the mRNA; 
means for inserting the cDNA copies into cells and 
permitting the cells to grow into clones; 

means for isolating a representative population of the 
clones and producing therefrom the library of biological 
sequences. 
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